Client
The tulit Client library supports multiple legal document retrieval sources organized by jurisdiction:
EU Level Clients
Cellar: EU Publications Office SPARQL endpoint for retrieving EU legal documents
Member State Clients
Finland (Finlex): Finnish legal database
France (Legifrance): French legal database
Germany (RIS): German legal information system
Italy (Normattiva): Italian legal database
Luxembourg (Legilux): Luxembourg legal portal
Malta: Maltese legal information
Portugal (DRE): Portuguese official gazette
Spain (BOE): Spanish official gazette
Ireland (Irish Statute Book): Irish legal database
Regional Clients
Veneto: Italian regional legislation (Veneto region)
Base Client
- class tulit.client.client.Client(download_dir, log_dir, proxies=None)
Bases: object
A generic document downloader class.
- __init__(download_dir, log_dir, proxies=None)
Initializes the downloader with directories for downloads and logs.
- handle_response(response, filename)
Handle a server response by saving or extracting its content.
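To make the "saving or extracting" behaviour concrete, here is a minimal sketch. It is not tulit's implementation: the helper name `save_or_extract`, its bytes-based signature, and the rule "unpack ZIP archives, write everything else to a single file" are assumptions made for illustration.

```python
import io
import os
import zipfile


def save_or_extract(content: bytes, target_dir: str, filename: str) -> str:
    """Hypothetical helper: extract ZIP payloads, save anything else as-is.

    Returns the path that was written: a directory for archives,
    a file for every other payload.
    """
    os.makedirs(target_dir, exist_ok=True)
    buffer = io.BytesIO(content)
    if zipfile.is_zipfile(buffer):
        # Archives are unpacked into a directory named after the document.
        extract_dir = os.path.join(target_dir, filename)
        with zipfile.ZipFile(buffer) as archive:
            archive.extractall(extract_dir)
        return extract_dir
    # Non-archive payloads are written to a single file.
    file_path = os.path.join(target_dir, filename)
    with open(file_path, "wb") as handle:
        handle.write(content)
    return file_path
```

Inspecting the payload itself, rather than trusting the Content-Type header alone, is one possible design choice; the real method may decide differently.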
- get_extension_from_content_type(content_type)
Map Content-Type to a file extension.
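A mapping of this kind could look roughly like the sketch below. The table contents and the function name `extension_for` are hypothetical; the content types that tulit actually recognises are not listed in this reference and may differ.

```python
from typing import Optional

# Hypothetical mapping; the real client may support other content types.
CONTENT_TYPE_EXTENSIONS = {
    "application/zip": "zip",
    "application/xml": "xml",
    "text/xml": "xml",
    "application/xhtml+xml": "xhtml",
    "text/html": "html",
    "application/pdf": "pdf",
    "text/plain": "txt",
    "application/json": "json",
}


def extension_for(content_type: str) -> Optional[str]:
    """Return a file extension for a Content-Type header value, or None."""
    # Drop parameters such as "; charset=utf-8" or ";mtype=fmx4".
    media_type = content_type.split(";", 1)[0].strip().lower()
    return CONTENT_TYPE_EXTENSIONS.get(media_type)


print(extension_for("application/xml;mtype=fmx4"))  # xml
print(extension_for("Text/HTML; charset=utf-8"))    # html
print(extension_for("image/png"))                   # None
```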
EU Clients
- class tulit.client.eu.cellar.CellarClient(download_dir, log_dir, proxies=None)
Bases: Client
- send_sparql_query(sparql_query, celex=None)
Sends a SPARQL query to the EU SPARQL endpoint and stores the results in a JSON file.
- Parameters:
- Return type:
None
- Raises:
FileNotFoundError – If the SPARQL query file is not found.
Exception – If there is an error sending the query or storing the results.
Notes
This function assumes that the SPARQL query file contains a valid SPARQL query. The results are stored in JSON format.
- get_results_table(sparql_query)
Sends a SPARQL query to the EU SPARQL endpoint and returns the results as a JSON object.
- Parameters:
sparql_query (str) – The SPARQL query as a string.
- Returns:
The results of the SPARQL query in JSON format.
- Return type:
- Raises:
Exception – If there is an error sending the query or retrieving the results.
Notes
This function uses the SPARQLWrapper library to send the query and retrieve the results. The results are returned in JSON format.
- fetch_content(url) -> Response
Send a GET request to download a file.
- Parameters:
url (str) – The URL to send the request to.
- Returns:
The response from the server.
- Return type:
requests.Response
Notes
The request is sent with the following headers:
- Accept: application/zip;mtype=fmx4, application/xml;mtype=fmx4, application/xhtml+xml, text/html, text/html;type=simplified, application/msword, text/plain, application/xml;notice=object
- Accept-Language: eng
- Content-Type: application/x-www-form-urlencoded
- Host: publications.europa.eu
- User-Agent: Browser user agent (required by EU server to bypass bot protection)
- Raises:
requests.RequestException – If there is an error sending the request.
See also
requests: The underlying library used for making HTTP requests.
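The header set listed in the notes for fetch_content can be written out as a plain dictionary. The sketch below only builds a request object and sends nothing. It uses the standard library's urllib so that it stays self-contained, whereas the client itself uses requests. The User-Agent string and the function name `build_cellar_request` are placeholders, since the reference does not give the exact values.

```python
import urllib.request

# Placeholder only: the documentation says a browser user agent is required
# but does not state the exact string the client sends.
BROWSER_USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64)"

CELLAR_HEADERS = {
    "Accept": ", ".join(
        [
            "application/zip;mtype=fmx4",
            "application/xml;mtype=fmx4",
            "application/xhtml+xml",
            "text/html",
            "text/html;type=simplified",
            "application/msword",
            "text/plain",
            "application/xml;notice=object",
        ]
    ),
    "Accept-Language": "eng",
    "Content-Type": "application/x-www-form-urlencoded",
    "Host": "publications.europa.eu",
    "User-Agent": BROWSER_USER_AGENT,
}


def build_cellar_request(url: str) -> urllib.request.Request:
    """Build (but do not send) a GET request carrying the headers above."""
    return urllib.request.Request(url, headers=CELLAR_HEADERS, method="GET")
```

With requests, the same dictionary could be passed as the `headers` argument of `requests.get`.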
- build_request_url(params)
Build the request URL based on the source and parameters.
- get_cellar_ids_from_json_results(results, format)
Extract CELLAR ids from a JSON dictionary.
- Parameters:
cellar_results (dict) – A dictionary containing the response of the CELLAR SPARQL query
- Returns:
A list of CELLAR ids.
- Return type:
Notes
The function assumes that the JSON dictionary has the following structure:
- The dictionary contains a key “results” that maps to another dictionary.
- The inner dictionary contains a key “bindings” that maps to a list of dictionaries.
- Each dictionary in the list contains a key “cellarURIs” that maps to a dictionary.
- The innermost dictionary contains a key “value” that maps to a string representing the CELLAR URI.
The function extracts the CELLAR id by splitting the CELLAR URI at “cellar/” and taking the second part.
Examples
>>> cellar_results = {
...     "results": {
...         "bindings": [
...             {"cellarURIs": {"value": "https://example.com/cellar/some_id"}},
...             {"cellarURIs": {"value": "https://example.com/cellar/another_id"}}
...         ]
...     }
... }
>>> cellar_ids = get_cellar_ids_from_json_results(cellar_results)
>>> print(cellar_ids)
['some_id', 'another_id']
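The extraction rule described in the notes (split each URI at "cellar/" and keep the second part) can be reproduced in a few lines. This is a standalone sketch rather than the library's code. The name `extract_cellar_ids` is hypothetical, and the sketch ignores the `format` argument of the documented method because its role is not described here.

```python
from typing import Any, Dict, List


def extract_cellar_ids(results: Dict[str, Any]) -> List[str]:
    """Collect the part after "cellar/" from every cellarURIs binding."""
    ids = []
    for binding in results["results"]["bindings"]:
        uri = binding["cellarURIs"]["value"]
        # "https://.../cellar/some_id" splits into ["https://.../", "some_id"].
        ids.append(uri.split("cellar/")[1])
    return ids


sample = {
    "results": {
        "bindings": [
            {"cellarURIs": {"value": "https://example.com/cellar/some_id"}},
            {"cellarURIs": {"value": "https://example.com/cellar/another_id"}},
        ]
    }
}
print(extract_cellar_ids(sample))  # ['some_id', 'another_id']
```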
- download(celex, format=None, type_id='celex')
Sends a REST query to the specified source APIs and downloads the documents corresponding to the given results.
Member State Clients
Finland (Finlex)
- class tulit.client.state.finlex.FinlexClient(download_dir, log_dir, proxies=None)
Bases: Client
Client for retrieving legal documents from the Finlex Open Data REST API.
API docs: https://opendata.finlex.fi/finlex/avoindata/v1
- BASE_URL = 'https://opendata.finlex.fi/finlex/avoindata/v1'
- download(year, number, lang='fi', doc_type='act', fmt='xml')
Download a statute XML from Finlex Open Data API. Example endpoint: /akn/fi/act/statute/2024/123/fin@
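The example endpoint suggests how a statute URL is assembled from the year and number. The sketch below follows that pattern only. The function name `build_finlex_statute_url` is hypothetical, and the way the client turns its two-letter `lang` argument into the three-letter code seen in the example ("fin") is not documented here, so the code takes the three-letter code directly.

```python
BASE_URL = "https://opendata.finlex.fi/finlex/avoindata/v1"


def build_finlex_statute_url(year: int, number: int, lang_code: str = "fin") -> str:
    """Assemble a statute URL following the documented example endpoint."""
    # lang_code is the three-letter code from the example endpoint; mapping
    # from the client's lang='fi' default to this code is an assumption.
    return f"{BASE_URL}/akn/fi/act/statute/{year}/{number}/{lang_code}@"


print(build_finlex_statute_url(2024, 123))
```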
France (Legifrance)
- class tulit.client.state.legifrance.LegifranceClient(client_id, client_secret, download_dir='./data/france/legifrance', log_dir='./data/logs', proxies=None)
Bases: Client
Client for interacting with the Legifrance API.
The Legifrance API provides access to French legal documents including:
- Codes
- Laws and decrees (LODA)
- Legislative dossiers
- Official journals (JORF)
- Collective agreements (KALI)
- Administrative documents
- Case law (JURI)
- Parliamentary debates
This client implements the main controllers:
- Consult: retrieve specific documents
- List: list documents with pagination
- Search: search across documents
- Suggest: autocomplete suggestions
- Chrono: versioned content
- get_token()
Obtain OAuth2 token from the Legifrance authentication service.
- Returns:
Access token for API requests
- Return type:
- consult_code(text_id: str, date: str | None = None, searched_string: str | None = None, sct_cid: str | None = None, abrogated: bool = False, from_suggest: bool = False) -> Dict[str, Any]
Get the content of a Code.
- Parameters:
text_id (str) – Text identifier (e.g., ‘LEGITEXT000006070721’ for Code Civil)
date (str, optional) – Date for versioned content (format: YYYY-MM-DD)
searched_string (str, optional) – Search string to highlight in the document
sct_cid (str, optional) – Section CID to retrieve specific section
abrogated (bool, optional) – Include abrogated versions (default: False)
from_suggest (bool, optional) – Indicates if request comes from suggest (default: False)
- Returns:
Code content
- Return type:
- consult_law_decree(text_id: str, date: str | None = None, searched_string: str | None = None, abrogated: bool = False, from_suggest: bool = False) -> Dict[str, Any]
Get the content of a law or decree (LODA).
- Parameters:
text_id (str) – Text identifier
date (str, optional) – Date for versioned content (format: YYYY-MM-DD)
searched_string (str, optional) – Search string to highlight in the document
abrogated (bool, optional) – Include abrogated versions (default: False)
from_suggest (bool, optional) – Indicates if request comes from suggest (default: False)
- Returns:
Law/decree content
- Return type:
- consult_article(article_id: str, date: str | None = None) -> Dict[str, Any]
Get the content of an article.
- consult_table_matieres(text_id: str, date: str | None = None) -> Dict[str, Any]
Get the table of contents for a LODA or CODE text.
- consult_legi_part(text_id: str, searched_string: str | None = None, date: str | None = None) -> Dict[str, Any]
Get partial LEGI text content (used for text fragment retrieval).
- consult_article_with_id_and_num(article_id: str, article_num: str, date: str | None = None) -> Dict[str, Any]
Get article by ID and number.
- consult_section_by_cid(cid: str, date: str | None = None) -> Dict[str, Any]
Get section content by CID.
- consult_kali_article(cid: str, article_num: str) -> Dict[str, Any]
Get collective agreement content from article.
- consult_kali_section(cid: str, section_id: str) -> Dict[str, Any]
Get collective agreement content from section.
- consult_code_with_ancien_id(ancien_id: str) -> Dict[str, Any]
Get code by ancien ID (legacy identifier).
- consult_same_num_article(article_id: str) -> Dict[str, Any]
Get list of articles with the same number.
- consult_concordance_links_article(article_id: str) -> Dict[str, Any]
Get concordance links for an article.
Get related links for an article.
- consult_service_public_links_article(article_id: str) -> Dict[str, Any]
Get public service links for an article.
- consult_has_service_public_links_article(text_id: str) -> Dict[str, Any]
Check which articles have public service links.
- list_codes(page_number: int = 1, page_size: int = 100, date: str | None = None) -> Dict[str, Any]
List codes with pagination.
- list_loda(page_number: int = 1, page_size: int = 100, date: str | None = None) -> Dict[str, Any]
List laws and decrees (LODA) with pagination.
- list_dossiers_legislatifs(page_number: int = 1, page_size: int = 100) -> Dict[str, Any]
List legislative dossiers with pagination.
- list_conventions(page_number: int = 1, page_size: int = 100) -> Dict[str, Any]
List collective agreements with pagination.
- list_bocc(page_number: int = 1, page_size: int = 100) -> Dict[str, Any]
List bulletins officiels des conventions collectives (BOCC).
- list_debats_parlementaires(legislature: str | None = None, page_number: int = 1, page_size: int = 100) -> Dict[str, Any]
List parliamentary debates.
- list_docs_admins(start_year: int, end_year: int, page_number: int = 1, page_size: int = 100) -> Dict[str, Any]
List administrative documents for a period.
- list_bodmr(start_year: int, end_year: int, page_number: int = 1, page_size: int = 100) -> Dict[str, Any]
List bulletins officiels des décorations, médailles et récompenses.
- list_questions_ecrites_parlementaires(page_number: int = 1, page_size: int = 100, filters: Dict[str, Any] | None = None) -> Dict[str, Any]
List parliamentary written questions with pagination.
- list_bocc_texts(page_number: int = 1, page_size: int = 100, filters: Dict[str, Any] | None = None) -> Dict[str, Any]
List BOCC unit texts with pagination.
- list_boccs_and_texts(page_number: int = 1, page_size: int = 100, filters: Dict[str, Any] | None = None) -> Dict[str, Any]
List BOCCs and their texts with pagination.
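Because the list_* methods share the `page_number` and `page_size` parameters, walking through all pages can be sketched generically. The helper below is not part of tulit. The response layout is not documented in this reference, so the caller supplies an `extract_items` function, and the "results" key in the stand-in lister is invented purely for the demonstration.

```python
from typing import Any, Callable, Dict, Iterator, List


def iter_pages(
    list_fn: Callable[..., Dict[str, Any]],
    extract_items: Callable[[Dict[str, Any]], List[Any]],
    page_size: int = 100,
    max_pages: int = 1000,
) -> Iterator[Any]:
    """Yield items page by page until an empty or short page is returned."""
    page_number = 1
    while page_number <= max_pages:
        response = list_fn(page_number=page_number, page_size=page_size)
        items = extract_items(response)
        if not items:
            break
        yield from items
        if len(items) < page_size:
            break  # a short page is treated as the last one
        page_number += 1


def fake_list_codes(page_number: int = 1, page_size: int = 100) -> Dict[str, Any]:
    """Stand-in for a list_* method; the 'results' key is made up."""
    data = [f"CODE-{i}" for i in range(5)]
    start = (page_number - 1) * page_size
    return {"results": data[start:start + page_size]}


codes = list(iter_pages(fake_list_codes, lambda r: r["results"], page_size=2))
print(codes)  # ['CODE-0', 'CODE-1', 'CODE-2', 'CODE-3', 'CODE-4']
```

With a real client, `list_fn` would be a bound method such as `client.list_codes`, and `extract_items` would need to match whatever structure the API actually returns.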
- search(search_query: str, page_number: int = 1, page_size: int = 10, filters: Dict[str, Any] | None = None) -> Dict[str, Any]
Generic search across indexed documents.
- search_canonical_version(text_id: str, date: str | None = None) -> Dict[str, Any]
Get canonical version info for a text.
- search_canonical_article_version(article_id: str, date: str | None = None) -> Dict[str, Any]
Get canonical article versions.
- search_nearest_version(text_id: str, date: str) -> Dict[str, Any]
Get nearest version info for a text at a given date.
- chrono_text_and_element(text_cid: str, element_cid: str, date: str | None = None) -> Dict[str, Any]
Get extract from a text version.
- misc_commit_id() -> Dict[str, Any]
Get deployment and versioning information.
- Returns:
Deployment/version info
- Return type:
- misc_dates_without_jo() -> Dict[str, Any]
Get list of dates without Official Journal.
- Returns:
List of dates without JO
- Return type:
- misc_years_without_table() -> Dict[str, Any]
Get list of years without tables.
- Returns:
List of years without tables
- Return type:
- download(endpoint: str, payload: Dict[str, Any], filename: str) -> str
Download document and save to file.
- download_code(text_id: str, date: str | None = None, searched_string: str | None = None, sct_cid: str | None = None, abrogated: bool = True, from_suggest: bool = True, enrich_articles: bool = False) -> str
Download a code and save to file.
- Parameters:
text_id (str) – Code identifier
date (str, optional) – Date for versioned content
searched_string (str, optional) – Search string to highlight
sct_cid (str, optional) – Section CID
abrogated (bool, optional) – Include abrogated versions (default: True for sandbox compatibility)
from_suggest (bool, optional) – From suggest (default: True for sandbox compatibility)
enrich_articles (bool, optional) – Fetch full article content for each article (default: False). Warning: this makes one API call per article and can be slow for large codes.
- Returns:
Path to saved file
- Return type:
Germany (RIS)
- class tulit.client.state.germany.GermanyClient(download_dir, log_dir, proxies=None)
Bases: Client
Client for retrieving legal documents from the German RIS (Rechtsinformationssystem) API.
This client supports:
- Legislation (laws and decrees)
- Case Law (court decisions)
- Literature (legal literature)
Base API: https://testphase.rechtsinformationen.bund.de
API Documentation: https://docs.rechtsinformationen.bund.de/
- __init__(download_dir, log_dir, proxies=None)
Initialize the Germany RIS client.
- download(document_type: str, format: str = 'html', **kwargs) -> str
Unified download method for German legal documents.
- Parameters:
document_type (str) – Type of document: ‘legislation’, ‘case_law’, ‘literature’, ‘eli’
format (str, optional) – Format: ‘html’, ‘xml’, ‘zip’ (default ‘html’)
**kwargs –
Additional parameters depending on document_type:
- For ‘legislation’:
jurisdiction, agent, year, natural_identifier, point_in_time, version, language, point_in_time_manifestation, subtype, filename
- For ‘case_law’:
document_number, filename
- For ‘literature’:
document_number, filename
- For ‘eli’:
eli_url, filename
- Returns:
Path to the downloaded file.
- Return type:
Ireland
- class tulit.client.state.irishstatutebook.IrishStatuteBookClient(download_dir, log_dir, proxies=None)
Bases: Client
Client for retrieving legal documents from the Irish Statute Book (ISB).
Example: https://www.irishstatutebook.ie/eli/2012/act/10/enacted/en/xml
- BASE_URL = 'https://www.irishstatutebook.ie/eli'
- download(year, act_number, lang='en', status='enacted', fmt='xml')
Download an Act XML from the Irish Statute Book.
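The example URL given for this client shows the order of the path segments: year, the literal "act", act number, status, language, format. A sketch of building such a URL from the documented download parameters follows. The function name `build_isb_act_url` is hypothetical, and the real method may assemble the URL differently.

```python
BASE_URL = "https://www.irishstatutebook.ie/eli"


def build_isb_act_url(
    year: int,
    act_number: int,
    lang: str = "en",
    status: str = "enacted",
    fmt: str = "xml",
) -> str:
    """Assemble an Act URL in the segment order shown by the documented example."""
    return f"{BASE_URL}/{year}/act/{act_number}/{status}/{lang}/{fmt}"


print(build_isb_act_url(2012, 10))
# https://www.irishstatutebook.ie/eli/2012/act/10/enacted/en/xml
```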
Italy (Normattiva)
- class tulit.client.state.normattiva.NormattivaClient(download_dir, log_dir, proxies=None)
Bases: Client
- fetch_content(uri, url) -> Response
Send a GET request to download a file.
- Parameters:
url (str) – The URL to send the request to.
- Returns:
The response from the server.
- Return type:
requests.Response
- Raises:
requests.RequestException – If there is an error sending the request.
- download(dataGU, codiceRedaz, dataVigenza='20251222', fmt='xml')
Luxembourg (Legilux)
Malta
- class tulit.client.state.malta.MaltaLegislationClient(download_dir, log_dir, proxies=None)
Bases: Client
Client for retrieving legal documents from the Maltese ELI portal.
See: https://legislation.mt/eli
- BASE_URL = 'https://legislation.mt/eli'
- download(eli_path, lang=None, fmt=None)
Download a document from the Maltese ELI portal.
- Parameters:
eli_path (str) – ELI path, e.g. ‘cap/9’, ‘sl/9.24’, ‘ln/2015/433’, ‘lcbl/49/2004/10’
lang (optional) – ‘mlt’ or ‘eng’
fmt (optional) – ‘pdf’, ‘xml’, or ‘html’; currently only ‘pdf’ is supported
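A request URL for the Maltese portal can be sketched by joining BASE_URL with the ELI path. Appending `lang` and `fmt` as trailing path segments is an assumption made for this sketch. The reference does not say how the client encodes them, and the function name `build_malta_eli_url` is hypothetical.

```python
from typing import Optional

BASE_URL = "https://legislation.mt/eli"


def build_malta_eli_url(
    eli_path: str,
    lang: Optional[str] = None,
    fmt: Optional[str] = None,
) -> str:
    """Join the base URL, the ELI path, and any optional segments."""
    segments = [BASE_URL, eli_path.strip("/")]
    # Assumption: language and format follow the ELI path as path segments.
    if lang:
        segments.append(lang)
    if fmt:
        segments.append(fmt)
    return "/".join(segments)


print(build_malta_eli_url("cap/9"))                # https://legislation.mt/eli/cap/9
print(build_malta_eli_url("cap/9", "eng", "pdf"))  # https://legislation.mt/eli/cap/9/eng/pdf
```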
Portugal (DRE)
- class tulit.client.state.portugal.PortugalDREClient(download_dir, log_dir, proxies=None)
Bases: Client
Client for retrieving legal documents from the Portuguese DRE ELI portal.
See: http://data.dre.pt/eli/
- BASE_URL = 'http://data.dre.pt/eli'
- download(document_type, series=None, number=None, year=None, supplement=0, act_type=None, month=None, day=None, region=None, cons_date=None, lang='pt', fmt='html')
Download documents from the Portuguese DRE ELI portal.
- Parameters:
document_type (str) – Type of document: ‘journal’, ‘legal_act’, or ‘consolidated’
series (str, optional) – For journals: series (‘1’, ‘1a’, ‘1b’, etc.)
number (str, optional) – Document number in the year
year (str, optional) – Year of publication
supplement (int, optional) – For journals: supplement number (default 0)
act_type (str, optional) – For legal acts: type (‘lei’, ‘dec-lei’, ‘declegreg’, etc.)
month (str, optional) – For legal acts: month of publication
day (str, optional) – For legal acts: day of publication
region (str, optional) – For legal acts: region (‘p’, ‘m’, ‘a’)
cons_date (str, optional) – For consolidated acts: consolidation date as ‘yyyymmdd’
lang (str, optional) – Language (default ‘pt’)
fmt (str, optional) – Format: ‘html’ or ‘pdf’ (default ‘html’)
- Returns:
Path to downloaded file or None if failed
- Return type:
str or None
Spain (BOE)
Boletín Oficial del Estado (BOE) client.
This module contains the BOEClient class, which is used to download XML files from the BOE API endpoint.
The documentation for the BOE API can be found at https://www.boe.es/datosabiertos/documentos/APIsumarioBOE.pdf