tld package

Submodules

tld.base module

class tld.base.BaseTLDSourceParser[source]

Bases: object

Base TLD source parser.

classmethod get_tld_names(fail_silently: bool = False, retry_count: int = 0)[source]

Get tld names.

Parameters:
  • fail_silently

  • retry_count

Returns:

include_private: bool = True
local_path: str
source_url: str
uid: str | None = None
classmethod update_tld_names(fail_silently: bool = False) bool[source]

Update the local copy of the TLD file.

Parameters:

fail_silently

Returns:

classmethod validate()[source]

Validate the parser class definition.

class tld.base.Registry(name, bases, attrs)[source]

Bases: type

REGISTRY: Dict[str, BaseTLDSourceParser] = {'mozilla': <class 'tld.utils.MozillaTLDSourceParser'>, 'mozilla_public_only': <class 'tld.utils.MozillaPublicOnlyTLDSourceParser'>}
classmethod get(key: str, default: BaseTLDSourceParser = None) BaseTLDSourceParser | None[source]
classmethod items() ItemsView[str, BaseTLDSourceParser][source]
classmethod reset() None[source]
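
The registry documented above can be illustrated with a small metaclass. This is only a sketch and not the library's code: SketchRegistry, SketchParser and DemoParser are hypothetical names, and registering classes under their uid is an assumption inferred from the REGISTRY keys shown above.

```python
from typing import Dict, ItemsView, Optional


class SketchRegistry(type):
    """Hypothetical metaclass that records every class defining a ``uid``."""

    REGISTRY: Dict[str, type] = {}

    def __new__(mcs, name, bases, attrs):
        new_cls = super().__new__(mcs, name, bases, attrs)
        uid = getattr(new_cls, "uid", None)
        if uid:  # base classes that leave uid unset are not registered
            mcs.REGISTRY[uid] = new_cls
        return new_cls

    @classmethod
    def get(mcs, key: str, default: Optional[type] = None) -> Optional[type]:
        return mcs.REGISTRY.get(key, default)

    @classmethod
    def items(mcs) -> ItemsView[str, type]:
        return mcs.REGISTRY.items()

    @classmethod
    def reset(mcs) -> None:
        mcs.REGISTRY = {}


class SketchParser(metaclass=SketchRegistry):
    uid: Optional[str] = None  # hypothetical base class, not registered


class DemoParser(SketchParser):
    uid = "demo"  # registered under this key at class creation time


print(SketchRegistry.get("demo") is DemoParser)
print(SketchRegistry.get("missing"))
```

Because registration happens when the class body is evaluated, defining a subclass is enough to make it discoverable by key.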

tld.conf module

tld.conf.get_setting(name: str, default: Any = None) Any

Gets a variable from local settings.

Parameters:
  • name (str)

  • default (mixed) – Default value.

Return type:

mixed

tld.conf.reset_settings() None

Reset settings.

tld.conf.set_setting(name: str, value: Any) None

Override default settings.

Parameters:
  • name (str)

  • value (mixed)
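
The three functions above describe a defaults-plus-overrides store. A minimal sketch of that idea follows; it assumes a plain dictionary of overrides, and the DEBUG default is a hypothetical value chosen for illustration, not a documented setting.

```python
from typing import Any, Dict

# Hypothetical default used only for this illustration.
_DEFAULTS: Dict[str, Any] = {"DEBUG": False}
_overrides: Dict[str, Any] = {}


def set_setting(name: str, value: Any) -> None:
    """Override a default for the current process."""
    _overrides[name] = value


def get_setting(name: str, default: Any = None) -> Any:
    """Return the override if present, then the built-in default, then ``default``."""
    if name in _overrides:
        return _overrides[name]
    return _DEFAULTS.get(name, default)


def reset_settings() -> None:
    """Drop every override so the built-in defaults apply again."""
    _overrides.clear()


set_setting("DEBUG", True)
print(get_setting("DEBUG"))
reset_settings()
print(get_setting("DEBUG"))
```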

tld.defaults module

tld.exceptions module

exception tld.exceptions.TldBadUrl(url)[source]

Bases: ValueError

TldBadUrl.

Supposed to be thrown when a bad URL is given.

exception tld.exceptions.TldDomainNotFound(domain_name)[source]

Bases: ValueError

TldDomainNotFound.

Supposed to be thrown when the domain name is not found in (did not match) the local TLD policy.

exception tld.exceptions.TldIOError[source]

Bases: OSError

TldIOError.

Supposed to be thrown when problems with reading/writing occur.

exception tld.exceptions.TldImproperlyConfigured[source]

Bases: Exception

TldImproperlyConfigured.

Supposed to be thrown when the code is improperly configured. A typical case is calling the get_tld function with both search_public and search_private set to False.
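
The base classes listed above decide which except clause catches each exception. The sketch below uses stand-in classes with the same names and bases so it runs without the package; classify is a hypothetical helper.

```python
class TldBadUrl(ValueError):
    """Stand-in sharing the documented base class (ValueError)."""


class TldDomainNotFound(ValueError):
    """Stand-in sharing the documented base class (ValueError)."""


class TldIOError(OSError):
    """Stand-in sharing the documented base class (OSError)."""


class TldImproperlyConfigured(Exception):
    """Stand-in sharing the documented base class (Exception)."""


def classify(exc: Exception) -> str:
    """Show which ``except`` clause each documented exception reaches."""
    try:
        raise exc
    except ValueError:
        return "bad input"
    except OSError:
        return "I/O problem"
    except Exception:
        return "configuration problem"


for exc in (TldBadUrl("x"), TldDomainNotFound("x"), TldIOError(), TldImproperlyConfigured()):
    print(type(exc).__name__, "->", classify(exc))
```

A caller that only wants to guard against malformed or unknown input can therefore catch ValueError, while file and configuration problems need their own handlers.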

tld.helpers module

tld.helpers.PROJECT_DIR(base: str) str

Project dir.

tld.helpers.project_dir(base: str) str[source]

Project dir.

tld.registry module

class tld.registry.Registry(name, bases, attrs)[source]

Bases: type

REGISTRY: Dict[str, BaseTLDSourceParser] = {'mozilla': <class 'tld.utils.MozillaTLDSourceParser'>, 'mozilla_public_only': <class 'tld.utils.MozillaPublicOnlyTLDSourceParser'>}
classmethod get(key: str, default: BaseTLDSourceParser = None) BaseTLDSourceParser | None[source]
classmethod items() ItemsView[str, BaseTLDSourceParser][source]
classmethod reset() None[source]

tld.result module

class tld.result.Result(tld: str, domain: str, subdomain: str, parsed_url: SplitResult)[source]

Bases: object

Container.

domain
property extension: str

Alias of tld.

Return type:

str

property fld: str

First level domain.

Returns:

Return type:

str

parsed_url
subdomain
property suffix: str

Alias of tld.

Return type:

str

tld
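
To make the relationship between the attributes and properties above concrete, here is a hypothetical container with the same field names. It assumes that fld is the domain joined to the tld with a dot, which matches the "first level domain" description but is not taken from the library source.

```python
from urllib.parse import SplitResult, urlsplit


class SketchResult:
    """Hypothetical container mirroring the documented attributes."""

    __slots__ = ("tld", "domain", "subdomain", "parsed_url")

    def __init__(self, tld: str, domain: str, subdomain: str, parsed_url: SplitResult) -> None:
        self.tld = tld
        self.domain = domain
        self.subdomain = subdomain
        self.parsed_url = parsed_url

    @property
    def fld(self) -> str:
        # Assumption: first level domain is "<domain>.<tld>".
        return f"{self.domain}.{self.tld}"

    @property
    def suffix(self) -> str:
        return self.tld  # alias of tld

    @property
    def extension(self) -> str:
        return self.tld  # alias of tld


res = SketchResult("co.uk", "google", "www", urlsplit("http://www.google.co.uk"))
print(res.fld)        # google.co.uk
print(res.suffix)     # co.uk
print(res.parsed_url.netloc)  # www.google.co.uk
```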

tld.trie module

class tld.trie.Trie[source]

Bases: object

An ad hoc Trie data structure to store TLDs in reverse notation order.

add(tld: str, private: bool = False) None[source]
class tld.trie.TrieNode[source]

Bases: object

Class representing a single Trie node.

children
exception
leaf
private
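
"Reverse notation order" means a suffix such as co.uk is stored label by label from the right (uk, then co). The following sketch shows that layout with hypothetical SketchNode and SketchTrie classes; the longest_suffix_len helper is an addition for illustration and the exception attribute is left unused here.

```python
from typing import Dict, List, Optional


class SketchNode:
    """Hypothetical node with the same attribute names as documented above."""

    __slots__ = ("children", "exception", "leaf", "private")

    def __init__(self) -> None:
        self.children: Dict[str, "SketchNode"] = {}
        self.exception: Optional[bool] = None  # not used in this sketch
        self.leaf = False
        self.private = False


class SketchTrie:
    def __init__(self) -> None:
        self.root = SketchNode()

    def add(self, tld: str, private: bool = False) -> None:
        node = self.root
        # "co.uk" is inserted as uk -> co
        for part in reversed(tld.split(".")):
            node = node.children.setdefault(part, SketchNode())
        node.leaf = True
        node.private = private

    def longest_suffix_len(self, labels: List[str]) -> int:
        """Return how many trailing labels form the longest stored suffix."""
        node: Optional[SketchNode] = self.root
        best = 0
        for depth, part in enumerate(reversed(labels), start=1):
            node = node.children.get(part)
            if node is None:
                break
            if node.leaf:
                best = depth
        return best


trie = SketchTrie()
for suffix in ("com", "uk", "co.uk"):
    trie.add(suffix)

print(trie.longest_suffix_len("www.google.co.uk".split(".")))  # 2
```

Walking from the rightmost label lets a lookup stop as soon as a label is missing, while still remembering the deepest complete suffix seen so far.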

tld.utils module

class tld.utils.BaseMozillaTLDSourceParser[source]

Bases: BaseTLDSourceParser

classmethod get_tld_names(fail_silently: bool = False, retry_count: int = 0) Dict[str, Trie] | None[source]

Parse.

Parameters:
  • fail_silently

  • retry_count

Returns:

class tld.utils.MozillaPublicOnlyTLDSourceParser[source]

Bases: BaseMozillaTLDSourceParser

Mozilla TLD source.

include_private: bool = False
local_path: str = 'res/effective_tld_names_public_only.dat.txt'
source_url: str = 'https://publicsuffix.org/list/public_suffix_list.dat?publiconly'
uid: str = 'mozilla_public_only'
class tld.utils.MozillaTLDSourceParser[source]

Bases: BaseMozillaTLDSourceParser

Mozilla TLD source.

local_path: str = 'res/effective_tld_names.dat.txt'
source_url: str = 'https://publicsuffix.org/list/public_suffix_list.dat'
uid: str = 'mozilla'
class tld.utils.Result(tld: str, domain: str, subdomain: str, parsed_url: SplitResult)[source]

Bases: object

Container.

domain
property extension: str

Alias of tld.

Return type:

str

property fld: str

First level domain.

Returns:

Return type:

str

parsed_url
subdomain
property suffix: str

Alias of tld.

Return type:

str

tld
tld.utils.get_fld(url: str | SplitResult, fail_silently: bool = False, fix_protocol: bool = False, search_public: bool = True, search_private: bool = True, parser_class: Type[BaseTLDSourceParser] = None, **kwargs) str | None[source]

Extract the first level domain.

Extract the first level domain based on Mozilla’s effective TLD names dat file. Returns a string. May raise TldBadUrl or TldDomainNotFound if a bad URL is provided or no TLD match is found, respectively.

Parameters:
  • url (str | SplitResult) – URL to get the first level domain from.

  • fail_silently (bool) – If set to True, no exceptions are raised and None is returned on failure.

  • fix_protocol (bool) – If set to True, a missing or wrong protocol is ignored (https is prepended instead).

  • search_public (bool) – If set to True, search in public domains.

  • search_private (bool) – If set to True, search in private domains.

  • parser_class

Returns:

String with the first level domain; returns None on failure.

Return type:

str | None

tld.utils.get_tld(url: str | SplitResult, fail_silently: bool = False, as_object: bool = False, fix_protocol: bool = False, search_public: bool = True, search_private: bool = True, parser_class: Type[BaseTLDSourceParser] = None) str | Result | None[source]

Extract the top level domain.

Extract the top level domain based on Mozilla’s effective TLD names dat file. Returns a string, or a tld.utils.Result object if as_object is set to True. May raise TldBadUrl or TldDomainNotFound if a bad URL is provided or no TLD match is found, respectively.

Parameters:
  • url (str | SplitResult) – URL to get top level domain from.

  • fail_silently (bool) – If set to True, no exceptions are raised and None is returned on failure.

  • as_object (bool) – If set to True, a tld.utils.Result object is returned, with domain, suffix and tld properties.

  • fix_protocol (bool) – If set to True, a missing or wrong protocol is ignored (https is prepended instead).

  • search_public (bool) – If set to True, search in public domains.

  • search_private (bool) – If set to True, search in private domains.

  • parser_class

Returns:

String with top level domain (if as_object argument is set to False) or a tld.utils.Result object (if as_object argument is set to True); returns None on failure.

Return type:

str | Result | None
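
The effect of fix_protocol is easiest to see on a URL that has no scheme: without one, the standard URL splitter finds no host at all. The sketch below puts https:// in front of such a URL before splitting it. split_with_fixed_protocol is a hypothetical helper, and the exact set of prefixes checked is an assumption.

```python
from urllib.parse import SplitResult, urlsplit


def split_with_fixed_protocol(url: str, fix_protocol: bool = False) -> SplitResult:
    """Split a URL, optionally supplying https:// when no scheme is present."""
    url = url.strip()
    # Assumption: "//", "http://" and "https://" count as an acceptable start.
    if fix_protocol and not url.startswith(("//", "http://", "https://")):
        url = f"https://{url}"
    return urlsplit(url)


print(split_with_fixed_protocol("www.google.co.uk").hostname)
print(split_with_fixed_protocol("www.google.co.uk", fix_protocol=True).hostname)
```

The first call prints None because the whole string is treated as a path; the second recovers the host name.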

tld.utils.get_tld_names(fail_silently: bool = False, retry_count: int = 0, parser_class: Type[BaseTLDSourceParser] = None) Dict[str, Trie][source]

Build the tlds list if empty. Recursive.

Parameters:
  • fail_silently (bool) – If set to True, no exceptions are raised and None is returned on failure.

  • retry_count (int) – If greater than 1, we raise an exception in order to avoid infinite loops.

  • parser_class (BaseTLDSourceParser)

Returns:

Container of TLD names

Return type:

Dict[str, Trie]
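
The retry_count parameter guards a recursive "load, and refresh on failure" loop. One way such a guard can work is sketched below; load_names, read_local, refresh and SketchIOError are hypothetical names, and the refresh-then-retry flow is an assumption based on the parameter descriptions above.

```python
from typing import Callable, List, Optional


class SketchIOError(OSError):
    """Hypothetical error raised once the retry budget is exhausted."""


def load_names(
    read_local: Callable[[], List[str]],
    refresh: Callable[[], None],
    fail_silently: bool = False,
    retry_count: int = 0,
) -> Optional[List[str]]:
    # Stop recursing once retry_count exceeds 1 to avoid an infinite loop.
    if retry_count > 1:
        if fail_silently:
            return None
        raise SketchIOError("giving up after repeated failures")
    try:
        return read_local()
    except OSError:
        refresh()  # try to repair the local copy, then retry once more
        return load_names(read_local, refresh, fail_silently, retry_count + 1)


calls = {"reads": 0}


def flaky_read() -> List[str]:
    """Fail on the first read only, as if the local file were missing."""
    calls["reads"] += 1
    if calls["reads"] == 1:
        raise OSError("local copy missing")
    return ["com", "co.uk"]


print(load_names(flaky_read, refresh=lambda: None))
```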

tld.utils.get_tld_names_container() Dict[str, Trie][source]

Get container of all tld names.

Returns:

Return type:

dict

tld.utils.is_tld(value: str | SplitResult, search_public: bool = True, search_private: bool = True, parser_class: Type[BaseTLDSourceParser] = None) bool[source]

Check whether the given value is a TLD.

Parameters:
  • value (str | SplitResult) – Value to check.

  • search_public (bool) – If set to True, search in public domains.

  • search_private (bool) – If set to True, search in private domains.

  • parser_class

Returns:

Return type:

bool

tld.utils.parse_tld(url: str | SplitResult, fail_silently: bool = False, fix_protocol: bool = False, search_public: bool = True, search_private: bool = True, parser_class: Type[BaseTLDSourceParser] = None) Tuple[None, None, None] | Tuple[str, str, str][source]

Parse TLD into parts.

Parameters:
  • url

  • fail_silently

  • fix_protocol

  • search_public

  • search_private

  • parser_class

Returns:

Tuple (tld, domain, subdomain)

Return type:

tuple
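
The (tld, domain, subdomain) tuple can be pictured as a three-way split of the host name around the longest matching suffix. The sketch below demonstrates this with a tiny hard-coded suffix set; KNOWN_SUFFIXES and sketch_parse are hypothetical, and a real lookup would consult the full suffix list instead.

```python
from typing import Optional, Set, Tuple
from urllib.parse import urlsplit

# Hypothetical, tiny suffix set used only for this illustration.
KNOWN_SUFFIXES: Set[str] = {"com", "uk", "co.uk"}


def sketch_parse(url: str) -> Tuple[Optional[str], Optional[str], Optional[str]]:
    """Return (tld, domain, subdomain), or three Nones when nothing matches."""
    host = urlsplit(url).hostname
    if not host:
        return None, None, None
    labels = host.split(".")
    # Try the longest candidate suffix first, always leaving one label for the domain.
    for start in range(1, len(labels)):
        candidate = ".".join(labels[start:])
        if candidate in KNOWN_SUFFIXES:
            domain = labels[start - 1]
            subdomain = ".".join(labels[: start - 1])
            return candidate, domain, subdomain
    return None, None, None


print(sketch_parse("http://some.subdomain.google.co.uk"))
# ('co.uk', 'google', 'some.subdomain')
```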

tld.utils.pop_tld_names_container(tld_names_local_path: str) None[source]

Remove TLD names container item.

Parameters:

tld_names_local_path

Returns:

tld.utils.process_url(url: str | SplitResult, fail_silently: bool = False, fix_protocol: bool = False, search_public: bool = True, search_private: bool = True, parser_class: Type[BaseTLDSourceParser] = <class 'tld.utils.MozillaTLDSourceParser'>) Tuple[List[str], int, SplitResult] | Tuple[None, None, SplitResult][source]

Process URL.

Parameters:
  • parser_class

  • url

  • fail_silently

  • fix_protocol

  • search_public

  • search_private

Returns:

tld.utils.reset_tld_names(tld_names_local_path: str = None) None[source]

Reset tld_names to an empty value.

If tld_names_local_path is given, removes only the specified entry from tld_names instead.

Parameters:

tld_names_local_path (str)

Returns:

tld.utils.update_tld_names(fail_silently: bool = False, parser_uid: str = None) bool[source]

Update TLD names.

Parameters:
  • fail_silently

  • parser_uid

Returns:

tld.utils.update_tld_names_cli() int[source]

CLI wrapper for update_tld_names.

Since update_tld_names returns True on success, we need to negate the result to match CLI semantics.
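
The negation mentioned above maps a Python success flag onto a process exit status, where 0 means success. A minimal sketch, with sketch_update standing in for the real update call:

```python
from typing import Callable


def sketch_update() -> bool:
    """Hypothetical stand-in that reports success."""
    return True


def sketch_cli(update: Callable[[], bool] = sketch_update) -> int:
    # True (success) becomes exit status 0; False (failure) becomes 1.
    return int(not update())


print(sketch_cli())                # 0
print(sketch_cli(lambda: False))   # 1
```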

tld.utils.update_tld_names_container(tld_names_local_path: str, trie_obj: Trie) None[source]

Update TLD Names container item.

Parameters:
  • tld_names_local_path

  • trie_obj

Returns:

Module contents

class tld.Result(tld: str, domain: str, subdomain: str, parsed_url: SplitResult)[source]

Bases: object

Container.

domain
property extension: str

Alias of tld.

Return type:

str

property fld: str

First level domain.

Returns:

Return type:

str

parsed_url
subdomain
property suffix: str

Alias of tld.

Return type:

str

tld
tld.get_fld(url: str | SplitResult, fail_silently: bool = False, fix_protocol: bool = False, search_public: bool = True, search_private: bool = True, parser_class: Type[BaseTLDSourceParser] = None, **kwargs) str | None[source]

Extract the first level domain.

Extract the first level domain based on Mozilla’s effective TLD names dat file. Returns a string. May raise TldBadUrl or TldDomainNotFound if a bad URL is provided or no TLD match is found, respectively.

Parameters:
  • url (str | SplitResult) – URL to get the first level domain from.

  • fail_silently (bool) – If set to True, no exceptions are raised and None is returned on failure.

  • fix_protocol (bool) – If set to True, a missing or wrong protocol is ignored (https is prepended instead).

  • search_public (bool) – If set to True, search in public domains.

  • search_private (bool) – If set to True, search in private domains.

  • parser_class

Returns:

String with the first level domain; returns None on failure.

Return type:

str | None

tld.get_tld(url: str | SplitResult, fail_silently: bool = False, as_object: bool = False, fix_protocol: bool = False, search_public: bool = True, search_private: bool = True, parser_class: Type[BaseTLDSourceParser] = None) str | Result | None[source]

Extract the top level domain.

Extract the top level domain based on Mozilla’s effective TLD names dat file. Returns a string, or a tld.utils.Result object if as_object is set to True. May raise TldBadUrl or TldDomainNotFound if a bad URL is provided or no TLD match is found, respectively.

Parameters:
  • url (str | SplitResult) – URL to get top level domain from.

  • fail_silently (bool) – If set to True, no exceptions are raised and None is returned on failure.

  • as_object (bool) – If set to True, a tld.utils.Result object is returned, with domain, suffix and tld properties.

  • fix_protocol (bool) – If set to True, a missing or wrong protocol is ignored (https is prepended instead).

  • search_public (bool) – If set to True, search in public domains.

  • search_private (bool) – If set to True, search in private domains.

  • parser_class

Returns:

String with top level domain (if as_object argument is set to False) or a tld.utils.Result object (if as_object argument is set to True); returns None on failure.

Return type:

str | Result | None

tld.get_tld_names(fail_silently: bool = False, retry_count: int = 0, parser_class: Type[BaseTLDSourceParser] = None) Dict[str, Trie][source]

Build the tlds list if empty. Recursive.

Parameters:
  • fail_silently (bool) – If set to True, no exceptions are raised and None is returned on failure.

  • retry_count (int) – If greater than 1, we raise an exception in order to avoid infinite loops.

  • parser_class (BaseTLDSourceParser)

Returns:

Container of TLD names

Return type:

Dict[str, Trie]

tld.is_tld(value: str | SplitResult, search_public: bool = True, search_private: bool = True, parser_class: Type[BaseTLDSourceParser] = None) bool[source]

Check whether the given value is a TLD.

Parameters:
  • value (str | SplitResult) – Value to check.

  • search_public (bool) – If set to True, search in public domains.

  • search_private (bool) – If set to True, search in private domains.

  • parser_class

Returns:

Return type:

bool

tld.parse_tld(url: str | SplitResult, fail_silently: bool = False, fix_protocol: bool = False, search_public: bool = True, search_private: bool = True, parser_class: Type[BaseTLDSourceParser] = None) Tuple[None, None, None] | Tuple[str, str, str][source]

Parse TLD into parts.

Parameters:
  • url

  • fail_silently

  • fix_protocol

  • search_public

  • search_private

  • parser_class

Returns:

Tuple (tld, domain, subdomain)

Return type:

tuple

tld.update_tld_names(fail_silently: bool = False, parser_uid: str = None) bool[source]

Update TLD names.

Parameters:
  • fail_silently

  • parser_uid

Returns: