Using Python to access the EIDR registry with EIDR REST APIs


What is EIDR?

EIDR (Entertainment Identifier Registry) is a universal unique identifier system for movie and television assets. From top-level franchises, titles, edits, and collections to series, seasons, episodes, and clips, EIDR provides globally unique identifiers for the entire range of singular and serial audiovisual object types that are relevant to both commercial and non-commercial works.

EIDR REST APIs

The EIDR system provides various services using a REST-based interface in combination with HTTP 1.1 (see RFC 2616).

Using Python to access the EIDR Registry through the EIDR REST API

Prerequisites

  1. You have Python installed. This tutorial uses Python 3.7.4. If you do not have Python yet, download it from https://www.python.org/ and follow the Python documentation to install it on your machine.
  2. You have a text editor installed on your machine. Many Python-friendly editors make writing Python code easier.

Assumptions

  1. You know basic Python programming.
  2. You are a member of EIDR and have credentials to access the EIDR Registry.

Let’s get started…

First things first: authentication and authorization. Following section 2.1.2 of the EIDR REST API document (page 8), every request to the EIDR Registry must carry an Authorization header built from your User ID, your Party ID, and a hashed form of your password. Install the requests package (pip install requests), then import it together with base64 and hashlib:

import requests, base64, hashlib

UserID = '10.5238/xxxxxxxx'  # enter your EIDR User ID
Pwd = '************'  # enter your EIDR password
PartyID = '10.5237/xxxxxxxxx'  # enter your EIDR Party ID
url = 'https://sandbox1.eidr.org:443/EIDR/'  # EIDR Registry URL

# Password shadow: Base64 encoding of the MD5 digest of the password
PasswordShadow = base64.b64encode(hashlib.md5(Pwd.encode('utf-8')).digest()).decode('utf8')
auth_str = '%s:%s:%s' % (UserID, PartyID, PasswordShadow)
headers = {'Authorization': 'Eidr {}'.format(auth_str), 'Accept': 'text/xml', 'Content-Type': 'text/xml'}

def getEIDRData(id):
    # Request the full metadata record for one EIDR ID, following aliases
    req = url + 'object/' + id + '?type=Full&followAlias=true'
    resp = requests.get(req, headers=headers)
    #print(resp.content)
    return resp.content

eidr_resp = getEIDRData('10.5240/0EF3-54F9-2642-0B49-6829-R')
eidr_resp
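
The header-building steps above can also be wrapped in a small function, which keeps the credentials out of module-level variables and makes the logic easy to check on its own. This is only a sketch of the same three steps used in this article (MD5 digest, Base64 encoding, colon-joined string); the function name build_eidr_auth_header and the sample credentials are made up for illustration.

```python
import base64
import hashlib


def build_eidr_auth_header(user_id, party_id, password):
    """Return the Authorization header value in the form used in this article.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # MD5 digest of the UTF-8 password (raw bytes, not the hex string)
    digest = hashlib.md5(password.encode('utf-8')).digest()
    # Base64-encode the 16 digest bytes to get the "password shadow"
    shadow = base64.b64encode(digest).decode('ascii')
    # Join as UserID:PartyID:PasswordShadow behind the "Eidr" scheme
    return 'Eidr {}:{}:{}'.format(user_id, party_id, shadow)


# Placeholder credentials, not real EIDR accounts
header_value = build_eidr_auth_header('10.5238/user', '10.5237/party', 'password')
headers = {
    'Authorization': header_value,
    'Accept': 'text/xml',
    'Content-Type': 'text/xml',
}
print(headers['Authorization'])
```

Note that MD5 followed by Base64 is a hash plus an encoding, not encryption, so the header should only travel over HTTPS, as the registry URL in this article does.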
# The 'xml' parser needs two packages: pip install beautifulsoup4 lxml
from bs4 import BeautifulSoup

soup = BeautifulSoup(eidr_resp, 'xml')
print(soup.prettify())
print('Type = {}'.format(soup.FullMetadata.BaseObjectData.ReferentType.contents[0]))
print('Title = {}'.format(soup.FullMetadata.BaseObjectData.ResourceName.contents[0]))
print('Release Year = {}'.format(soup.FullMetadata.BaseObjectData.ReleaseDate.contents[0]))
print('RunTime = {}'.format(soup.FullMetadata.BaseObjectData.ApproximateLength.contents[0]))
print('EIDR ID = {}'.format(soup.FullMetadata.BaseObjectData.ID.contents[0]))
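
One caveat with chained lookups like soup.FullMetadata.BaseObjectData.ReleaseDate.contents[0]: if a record lacks one of those elements, the chain hits None and raises an AttributeError. The sketch below shows one way to read the same fields with a default value instead. To keep it dependency-free it uses the standard library's xml.etree.ElementTree rather than BeautifulSoup, and the sample XML (including its namespace URI and field values) is a hand-written stand-in, not a real registry response.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for a registry response (namespace and values are hypothetical)
SAMPLE_XML = """<FullMetadata xmlns="http://example.invalid/hypothetical">
  <BaseObjectData>
    <ID>10.5240/0EF3-54F9-2642-0B49-6829-R</ID>
    <ReferentType>Movie</ReferentType>
    <ResourceName>Example Title</ResourceName>
    <ReleaseDate>2001</ReleaseDate>
  </BaseObjectData>
</FullMetadata>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit('}', 1)[-1]


def base_object_fields(xml_text, names, default=''):
    """Return {name: text} for direct children of BaseObjectData.

    Missing or empty elements get `default` instead of raising.
    """
    root = ET.fromstring(xml_text)
    base = next((el for el in root.iter()
                 if local_name(el.tag) == 'BaseObjectData'), None)
    found = {}
    if base is not None:
        for child in base:
            text = (child.text or '').strip()
            # Keep the first non-empty occurrence of each element name
            if text and local_name(child.tag) not in found:
                found[local_name(child.tag)] = text
    return {name: found.get(name, default) for name in names}


fields = base_object_fields(
    SAMPLE_XML,
    ['ID', 'ReferentType', 'ResourceName', 'ReleaseDate', 'ApproximateLength'],
)
for name, value in fields.items():
    print('{} = {}'.format(name, value))
```

In the sample, ApproximateLength is absent on purpose, so it comes back as an empty string rather than stopping the script.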
Putting it all together, the complete script looks like this:

import requests, base64, hashlib
from bs4 import BeautifulSoup

UserID = '10.5238/xxxxxxxx'  # enter your EIDR User ID
Pwd = '************'  # enter your EIDR password
PartyID = '10.5237/xxxxxxxxx'  # enter your EIDR Party ID
url = 'https://sandbox1.eidr.org:443/EIDR/'  # EIDR Registry URL

# Hash the password (MD5 digest, then Base64) and build the Authorization header
PasswordShadow = base64.b64encode(hashlib.md5(Pwd.encode('utf-8')).digest()).decode('utf8')
auth_str = '%s:%s:%s' % (UserID, PartyID, PasswordShadow)
headers = {'Authorization': 'Eidr {}'.format(auth_str), 'Accept': 'text/xml', 'Content-Type': 'text/xml'}

# getEIDRData function: fetch the full metadata record for one EIDR ID
def getEIDRData(id):
    req = url + 'object/' + id + '?type=Full&followAlias=true'
    resp = requests.get(req, headers=headers)
    #print(resp.content)
    return resp.content

# Get the EIDR ID details by calling the getEIDRData function
eidr_resp = getEIDRData('10.5240/0EF3-54F9-2642-0B49-6829-R')

# Print the raw response from EIDR
print(eidr_resp)

# Format the EIDR response
soup = BeautifulSoup(eidr_resp, 'xml')
print(soup.prettify())

# Extract the specific EIDR values you need from the response using the BeautifulSoup package
print('Type = {}'.format(soup.FullMetadata.BaseObjectData.ReferentType.contents[0]))
print('Title = {}'.format(soup.FullMetadata.BaseObjectData.ResourceName.contents[0]))
print('Release Year = {}'.format(soup.FullMetadata.BaseObjectData.ReleaseDate.contents[0]))
print('RunTime = {}'.format(soup.FullMetadata.BaseObjectData.ApproximateLength.contents[0]))
print('EIDR ID = {}'.format(soup.FullMetadata.BaseObjectData.ID.contents[0]))
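
The getEIDRData function returns whatever the server sends back, even when the request fails. A possible refinement, sketched below, is to check the HTTP status and set a timeout before handing the body to the parser. To keep the sketch runnable without network access or credentials, the HTTP call is passed in as a parameter: in real use you would pass requests.get, while the demo uses a fake. The names fetch_eidr_xml, FakeResponse, and fake_get are hypothetical, and the sketch does not look for error details that the registry may report inside the XML body.

```python
class FakeResponse:
    """Hypothetical stand-in for requests.Response, used only in this demo."""

    def __init__(self, status_code, content):
        self.status_code = status_code
        self.content = content


def fetch_eidr_xml(eidr_id, http_get, base_url, headers, timeout=30):
    """Fetch one record and fail loudly on a non-200 HTTP status.

    `http_get` is any callable with the same keyword arguments as
    requests.get (url, headers=..., timeout=...).
    """
    req = base_url + 'object/' + eidr_id + '?type=Full&followAlias=true'
    resp = http_get(req, headers=headers, timeout=timeout)
    if resp.status_code != 200:
        raise RuntimeError(
            'EIDR request for {} failed with HTTP {}'.format(eidr_id, resp.status_code))
    return resp.content


def fake_get(url, headers=None, timeout=None):
    """Demo transport: pretend that only IDs ending in '-R' exist."""
    if url.split('?')[0].endswith('-R'):
        return FakeResponse(200, b'<FullMetadata/>')
    return FakeResponse(404, b'')


BASE_URL = 'https://sandbox1.eidr.org:443/EIDR/'
body = fetch_eidr_xml('10.5240/0EF3-54F9-2642-0B49-6829-R', fake_get, BASE_URL, {})
print(body)
```

With the requests package installed, the call would become fetch_eidr_xml(some_id, requests.get, url, headers), using the url and headers variables from the script above.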

Bonus Material

The code above looks up a single EIDR ID. A more practical use is to look up many EIDR IDs in one run. Save the IDs in a text file named input_eidrs.txt, one ID per line, for example:

10.5240/2B8B-96D7-3142-0F17-C4F1-K
10.5240/2B95-4875-AFEC-2468-BC0C-1
10.5240/2B9B-79F1-F1E6-2FE6-2295-3
10.5240/2B9E-40B4-F563-F31A-1C97-U
10.5240/2B9F-EBC3-0F7F-7112-E388-Z
10.5240/2BA3-6BFA-95D0-7FEF-1DC5-F
10.5240/2BA3-F378-3E87-F338-C560-E
10.5240/2BA9-45B1-2D8D-2A67-C531-Z
10.5240/2BA9-5306-B286-E868-93B8-6
10.5240/35C9-4085-4FE9-8ACA-F464-W
10.5240/35DA-5397-4068-053C-2B23-2
10.5240/35E4-0485-8BA4-FBFE-BC47-3
10.5240/35E5-49E7-7365-B0B1-3848-D
10.5240/35FF-B72E-F96D-7B50-9180-F
10.5240/3613-72AB-2916-044E-8E1C-A
10.5240/3619-02F2-54DF-52A9-B545-Y
10.5240/361A-D1B7-5127-BD94-2196-N
10.5240/3620-6102-F372-9FA5-F26B-A
10.5240/3620-F84D-C5C3-ABA8-9BCE-E
10.5240/3621-D010-2660-5F34-883C-Y
Then read the file into a list of IDs:

import pandas as pd

inputFile = 'input_eidrs.txt'
with open(inputFile) as f:
    eidrs = f.read().splitlines()
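
Input files edited by hand often pick up blank lines, stray spaces, or repeated IDs. As a small optional sketch, the reader below trims each line, skips empty ones, and drops duplicates while keeping the original order. The function name read_eidr_ids is made up for this example, and the demo writes its own temporary file so it does not touch your input_eidrs.txt.

```python
import os
import tempfile


def read_eidr_ids(path):
    """Return the IDs in `path`, one per line, trimmed and de-duplicated."""
    seen = set()
    ids = []
    with open(path, encoding='utf-8') as f:
        for line in f:
            eidr_id = line.strip()
            # Skip blank lines and IDs already collected
            if eidr_id and eidr_id not in seen:
                seen.add(eidr_id)
                ids.append(eidr_id)
    return ids


# Demo with a throwaway file containing a blank line, padding, and a repeat
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False, encoding='utf-8') as tmp:
    tmp.write('10.5240/2B8B-96D7-3142-0F17-C4F1-K\n')
    tmp.write('\n')
    tmp.write('  10.5240/2B95-4875-AFEC-2468-BC0C-1  \n')
    tmp.write('10.5240/2B8B-96D7-3142-0F17-C4F1-K\n')
    demo_path = tmp.name

demo_ids = read_eidr_ids(demo_path)
os.remove(demo_path)
print(demo_ids)
```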
  1. Loop through each EIDR ID and pass the ID to the getEIDRData function.
  2. Extract the required EIDR data values and store them in the output_list object.
output_list = []
for e in eidrs:
    if len(e) == 34:  # an EIDR ID is 34 characters long; skip anything else as invalid
        eidr_resp = getEIDRData(e)
        soup = BeautifulSoup(eidr_resp, 'xml')

        Title = soup.FullMetadata.BaseObjectData.ResourceName.contents[0]
        Type = soup.FullMetadata.BaseObjectData.ReferentType.contents[0]
        ReleaseYear = soup.FullMetadata.BaseObjectData.ReleaseDate.contents[0]
        RunTime = soup.FullMetadata.BaseObjectData.ApproximateLength.contents[0]

        output_list.append([e, Title, Type, ReleaseYear, RunTime])

df_output = pd.DataFrame(output_list, columns=['ID', 'Title', 'Type', 'ReleaseYear', 'Runtime'])
df_output.head()
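
The length check in the loop accepts any 34-character string. If you want a slightly stricter filter, a pattern check is one option. The sketch below assumes the ID shape visible in the sample list in this article: the prefix 10.5240/, five hyphen-separated groups of four uppercase hexadecimal digits, and one final character. It does not verify that the final check character is actually correct, and the names EIDR_ID_PATTERN and looks_like_eidr_id are made up for this example.

```python
import re

# Shape inferred from the IDs shown in this article (an assumption, not the official grammar)
EIDR_ID_PATTERN = re.compile(r'10\.5240/(?:[0-9A-F]{4}-){5}[0-9A-Z]')


def looks_like_eidr_id(value):
    """Return True if `value` has the shape of the content IDs used in this article."""
    return EIDR_ID_PATTERN.fullmatch(value) is not None


candidates = [
    '10.5240/2B8B-96D7-3142-0F17-C4F1-K',  # from the sample list
    '10.5240/2b8b-96d7-3142-0f17-c4f1-k',  # lowercase: rejected by this pattern
    'x' * 34,                              # right length, wrong shape
]
for candidate in candidates:
    print(candidate, looks_like_eidr_id(candidate))
```

In the loop, `if looks_like_eidr_id(e):` could then take the place of `if len(e) == 34:`.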
df_output.to_excel('eidr_data.xlsx')
The complete script for the multi-ID lookup:

import requests, base64, hashlib
from bs4 import BeautifulSoup
import pandas as pd

UserID = '10.5238/xxxxxxxx'  # enter your EIDR User ID
Pwd = '************'  # enter your EIDR password
PartyID = '10.5237/xxxxxxxxx'  # enter your EIDR Party ID
url = 'https://sandbox1.eidr.org:443/EIDR/'  # EIDR Registry URL

# Hash the password (MD5 digest, then Base64) and build the Authorization header
PasswordShadow = base64.b64encode(hashlib.md5(Pwd.encode('utf-8')).digest()).decode('utf8')
auth_str = '%s:%s:%s' % (UserID, PartyID, PasswordShadow)
headers = {'Authorization': 'Eidr {}'.format(auth_str), 'Accept': 'text/xml', 'Content-Type': 'text/xml'}

# getEIDRData function: fetch the full metadata record for one EIDR ID
def getEIDRData(id):
    req = url + 'object/' + id + '?type=Full&followAlias=true'
    resp = requests.get(req, headers=headers)
    #print(resp.content)
    return resp.content

# Read the EIDR IDs, one per line
inputFile = 'input_eidrs.txt'
with open(inputFile) as f:
    eidrs = f.read().splitlines()

# List object to store the EIDR response values
output_list = []

# Loop through each EIDR ID
for e in eidrs:
    # An EIDR ID is 34 characters long; skip anything else as invalid
    if len(e) == 34:
        # Call getEIDRData for each EIDR ID
        eidr_resp = getEIDRData(e)
        # Parse the XML output
        soup = BeautifulSoup(eidr_resp, 'xml')
        # Extract the required values from the EIDR XML
        Title = soup.FullMetadata.BaseObjectData.ResourceName.contents[0]
        Type = soup.FullMetadata.BaseObjectData.ReferentType.contents[0]
        ReleaseYear = soup.FullMetadata.BaseObjectData.ReleaseDate.contents[0]
        RunTime = soup.FullMetadata.BaseObjectData.ApproximateLength.contents[0]
        # Save the extracted values in the list object
        output_list.append([e, Title, Type, ReleaseYear, RunTime])

# Use the output list object to create a new dataframe
df_output = pd.DataFrame(output_list, columns=['ID', 'Title', 'Type', 'ReleaseYear', 'Runtime'])

# Display the top 5 records of the dataframe
df_output.head()

# Export the dataframe to Excel (pandas needs an Excel writer engine such as openpyxl for .xlsx)
df_output.to_excel('eidr_data.xlsx')

Data Management and Operations Strategy
