API

BIP39 Validator comes with an API for querying the results of validation tests. The most basic class provided is BIP39WordList. It is responsible for creating a wordlist object from a file, a string buffer or even a URL. BIP39WordList objects are immutable: words can’t be changed, added or removed once they are loaded. To alter the wordlist, change it on file and then create a new BIP39WordList from it.

When a test fails, it throws a ValidationFailed exception. The exception has a member called status_obj, an object holding diagnostic information about the test that threw it. The same object is returned by the validation test if it succeeds. There are two ways to capture the test state because users most commonly look at the state only when a test fails.
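
The two capture paths described above can be sketched with stand-in classes. The names Result, run_test and capture are hypothetical, and the ValidationFailed class here only mimics the documented status_obj member; none of this is the library's actual code.

```python
class Result:
    """Hypothetical stand-in for a diagnostic result object."""
    def __init__(self, offending):
        self.offending = offending


class ValidationFailed(Exception):
    """Stand-in exception that carries the result in status_obj."""
    def __init__(self, status_obj=None):
        super().__init__("validation failed")
        self.status_obj = status_obj


def run_test(words, max_len):
    """Toy test: fail if any word is longer than max_len."""
    result = Result([w for w in words if len(w) > max_len])
    if result.offending:
        # Failure path: the result travels inside the exception.
        raise ValidationFailed(result)
    # Success path: the result is returned directly.
    return result


def capture(words, max_len):
    """Return the result object regardless of which path produced it."""
    try:
        return run_test(words, max_len)
    except ValidationFailed as exc:
        return exc.status_obj
```

With this shape, callers who only care about failures can ignore the return value and read status_obj in their except block.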

Classes

The most basic class of the BIP39 Validator API is BIP39WordList. This class is responsible for loading the wordlist from an input source, such as the local disk or the network, and running the validation tests on it. It also stores the line numbers of the words, to enable cross-referencing results with the original file.
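
As a rough illustration of why line numbers are stored, the sketch below pairs each word with its line number before sorting, so a result can still be traced back to the original file. The function name index_words and the 1-based numbering are assumptions made for this example.

```python
def index_words(text):
    """Pair every word with its line number, then sort by word.

    Line numbers are assumed to be 1-based. Sorting the pairs keeps
    each word attached to the line it came from.
    """
    pairs = [(word, lineno)
             for lineno, word in enumerate(text.split("\n"), start=1)]
    return sorted(pairs)
```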

Once you create a BIP39WordList instance, two groups of operations are available. The first group collects metadata about the wordlist: the words themselves, their line numbers, and the number of words. The second group is the validation tests themselves.

Each validation test is exposed as a BIP39WordList method. They are:

  • test_lowercase(), to perform the well-formed test
  • test_lev_distance(n), to perform the Levenshtein distance test
  • test_initial_chars(n), to perform the initial unique characters test
  • test_max_length(n), to perform the maximum length test

test_lowercase() returns a ValidWordList class on success and throws an InvalidWordList exception on failure. The others each return their own result class (LevDistResult, InitUniqResult, and MaxLengthResult respectively) and, on failure, throw a ValidationFailed exception whose status_obj holds that same class. This way the same result object can be explored whether the test succeeded or failed.

Except for test_lowercase() itself, all tests run test_lowercase() before running their own tests to ensure that the wordlist is well-formed.

API Reference

class bip39validator.BIP39WordList(desc, string=None, handle=None, url=None)

Encapsulates a BIP39 wordlist.

__init__(desc, string=None, handle=None, url=None)

Initializes a BIP39WordList object

Words can be read from a string buffer, a file handle or a URL. The precedence order of the inputs is string, then handle, then url. The wordlist is expected to contain one word on each line with no intermediate whitespace or blank lines (this includes a newline at the end of the file), but actual validation of these requirements is not performed here.
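
The precedence order can be sketched as follows. The function read_source is a hypothetical helper written for this example. It does not fetch anything from the network, and only reports which input would win.

```python
import io


def read_source(string=None, handle=None, url=None):
    """Pick the first provided input in the order string, handle, url."""
    if string is not None:
        return ("string", string)
    if handle is not None:
        return ("handle", handle.read())
    if url is not None:
        # A real loader would download the URL here; this sketch does not.
        return ("url", url)
    raise ValueError("string, handle or url must be specified")
```

Passing both string and url, for instance, selects the string buffer and never touches the URL.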

Parameters:
  • desc (str) – textual description of the word list
  • string (str, optional) – string buffer to read the words from, defaults to None
  • handle (_io.TextIOWrapper, optional) – file handle to read the words from, defaults to None
  • url (str, optional) – URL to read the words from, defaults to None
Raises:
  • ValueError – string, handle or url must be specified
  • InvalidWordList – non-lowercase characters in words
desc = None

Textual description of the wordlist.

test_initial_chars(n)

Runs the maximum unique initial characters test.

The maximum unique initial characters test takes each combination of two words in the wordlist and compares their first n characters for equality.

n is required, since there is no use case for analyzing pairs of BIP39 words with arbitrary unique prefixes.
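
A minimal sketch of the pairwise prefix comparison described above, using only the standard library. The function name shared_prefix_pairs is an assumption, and the real test may organise the comparison differently.

```python
from itertools import combinations


def shared_prefix_pairs(words, n):
    """Return every pair of words whose first n characters are identical."""
    return [(a, b) for a, b in combinations(words, 2) if a[:n] == b[:n]]
```

An empty result means every word is uniquely identified by its first n characters.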

Parameters:n (int) – maximum unique initial characters required.
Returns:an instance of InitUniqResult
Raises:ValidationFailed – <InitUniqResult object>
test_lev_distance(n)

Runs the minimum Levenshtein distance test.

The minimum Levenshtein distance test takes each combination of two words in the wordlist and calculates the Levenshtein distance between them.
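
For reference, the pairwise calculation can be sketched with a textbook dynamic-programming Levenshtein distance. The names levenshtein and pairs_below are assumptions for this example, as is the choice to flag pairs whose distance is strictly less than n. The library's own implementation may differ.

```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def pairs_below(words, n):
    """Return (word1, word2, distance) for pairs closer than n."""
    close = []
    for a, b in combinations(words, 2):
        d = levenshtein(a, b)
        if d < n:
            close.append((a, b, d))
    return close
```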

Parameters:n (int) – minimum Levenshtein distance required
Returns:an instance of LevDistResult
Raises:ValidationFailed – <LevDistResult object>
test_lowercase()

Checks for forbidden characters in a wordlist.

Checks each word in the wordlist to ensure it only contains lowercase characters on each line, with no empty lines or whitespace anywhere on each line. Trailing newline at the end of the file is also forbidden.
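
The check described above might look like the sketch below. It assumes that "lowercase characters" means ASCII a-z only, which the text does not state, and find_invalid_lines is a hypothetical helper name.

```python
import re

# Assumption: a valid line is one or more ASCII lowercase letters.
_WORD = re.compile(r"[a-z]+")


def find_invalid_lines(text):
    """Return (line_content, line_number) for each malformed line."""
    bad = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        if not _WORD.fullmatch(line):
            bad.append((line, lineno))
    return bad
```

Because the buffer is split on newlines, a trailing newline produces a final empty line, which is reported like any other blank line.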

Returns:an instance of ValidWordList
Raises:InvalidWordList – non-lowercase characters in one or more words
test_max_length(n)

Runs the maximum word length test.

Parameters:n (int) – maximum word length allowed. This parameter is required, since there is no use case for analyzing BIP39 words with arbitrary lengths.
Returns:an instance of MaxLengthResult
Raises:ValidationFailed – <MaxLengthResult object>
class bip39validator.ValidWordList.ValidWordList(**kwargs)

The wordlist is well-formed and has no invalid characters.

Data structure returned by BIP39WordList.test_lowercase(). This class is not meant to be created directly.
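
As an illustration of what the is_sorted and has_2048_words flags convey, the sketch below derives both from a plain list of words. The function summarize is hypothetical and says nothing about how the class computes them internally.

```python
def summarize(words):
    """Compute the two boolean flags from a list of words."""
    return {
        "is_sorted": words == sorted(words),
        "has_2048_words": len(words) == 2048,
    }
```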

__init__(**kwargs)

Initialize self. See help(type(self)) for accurate signature.

err_lines = None

Tuple of line contents and line numbers of invalid words.

has_2048_words = None

Indicates if the wordlist has exactly 2048 words.

has_invalid_chars = False

Indicates if the wordlist contains invalid characters.

is_sorted = None

Indicates if the wordlist file is in sorted order.

class bip39validator.LevDistResult(res, words_sorted, lines_sorted, threshold)

Levenshtein distances between each word pair.

Data structure returned or raised by BIP39WordList.test_lev_distance(). This class is not meant to be created directly.
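
To give a feel for the query methods listed below, here is a sketch over a hand-written distance table. TABLE, getdist and word_pairs are hypothetical names. The real class stores its data in its own way, so this only mirrors the idea of filtering pairs by a comparison.

```python
import operator

# Hypothetical precomputed distances, keyed by alphabetically ordered pairs.
TABLE = {
    ("able", "axle"): 1,
    ("able", "zoo"): 4,
    ("axle", "zoo"): 4,
}


def getdist(table, word1, word2):
    """Look up a distance regardless of argument order."""
    return table[tuple(sorted((word1, word2)))]


def word_pairs(table, compare, dist):
    """Return the pairs whose distance satisfies compare(distance, dist)."""
    return [pair for pair, d in table.items() if compare(d, dist)]
```

Passing operator.lt, operator.eq or operator.gt as compare corresponds to the _lt, _eq and _gt method families.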

__init__(res, words_sorted, lines_sorted, threshold)

Initialize self. See help(type(self)) for accurate signature.

getdist(word1, word2)

Gets Levenshtein distance between word1 and word2

Parameters:
  • word1 (str) – first word
  • word2 (str) – second word
Returns:

Levenshtein distance between word1 and word2

getdist_all(word)

Gets Levenshtein distance between word and all other words

Parameters:word (str) – the word
Returns:list of Levenshtein distances between word and each word
getdist_all_eq(word, dist=None)

Gets Levenshtein distance between word and all other words, equal to dist

Parameters:
  • word (str) – the word
  • dist (int) – Levenshtein distance
Returns:

list of Levenshtein distances between word and each word

getdist_all_gt(word, dist=None)

Gets Levenshtein distance between word and all other words, greater than dist

Parameters:
  • word (str) – the word
  • dist (int) – Levenshtein distance
Returns:

list of Levenshtein distances between word and each word

getdist_all_list(word, dists)

Gets Levenshtein distance between word and all other words, inside the list dists

Parameters:
  • word (str) – the word
  • dists (list) – list of Levenshtein distances
Returns:

list of Levenshtein distances between word and each word

getdist_all_lt(word, dist=None)

Gets Levenshtein distance between word and all other words, less than dist

Parameters:
  • word (str) – the word
  • dist (int) – Levenshtein distance
Returns:

list of Levenshtein distances between word and each word

getlinepairs_eq(dist=None)

Gets the line numbers of pairs which have a Levenshtein distance of dist

Parameters:dist (int) – Levenshtein distance
Returns:a list of line number pairs
getlinepairs_gt(dist=None)

Gets the line numbers of pairs which have a Levenshtein distance greater than dist

Parameters:dist (int) – Levenshtein distance
Returns:a list of line number pairs
getlinepairs_list(dists)

Gets the line numbers of pairs which have a Levenshtein distance inside the list dists

Parameters:dists (list) – list of Levenshtein distances
Returns:a list of line number pairs
getlinepairs_lt(dist=None)

Gets the line numbers of pairs which have a Levenshtein distance less than dist

Parameters:dist (int) – Levenshtein distance
Returns:a list of line number pairs
getwordpairs_eq(dist=None)

Gets the word pairs which have a Levenshtein distance of dist

Parameters:dist (int) – Levenshtein distance
Returns:a list of word pairs
getwordpairs_gt(dist=None)

Gets the word pairs which have a Levenshtein distance greater than dist

Parameters:dist (int) – Levenshtein distance
Returns:a list of word pairs
getwordpairs_list(dists)

Gets the word pairs which have a Levenshtein distance inside the list dists

Parameters:dists (list) – list of Levenshtein distances
Returns:a list of word pairs
getwordpairs_lt(dist=None)

Gets the word pairs which have a Levenshtein distance less than dist

Parameters:dist (int) – Levenshtein distance
Returns:a list of word pairs
class bip39validator.InitUniqResult(res, threshold)

Initial unique characters (prefix) shared by each word pair.

Data structure returned or raised by BIP39WordList.test_initial_chars(). This class is not meant to be created directly.

__init__(res, threshold)

Initialize self. See help(type(self)) for accurate signature.

groups_length(n)

Gets the list of groups of words and lines of similar prefix length n.
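
Grouping by prefix can be sketched with a dictionary keyed on the first n characters. The function group_by_prefix is an assumption for this example and takes (word, line) tuples as input.

```python
from collections import defaultdict


def group_by_prefix(pairs, n):
    """Group (word, line) tuples by the first n characters of the word."""
    groups = defaultdict(list)
    for word, line in sorted(pairs):
        groups[word[:n]].append((word, line))
    return dict(groups)
```

Any group holding more than one tuple marks words that cannot be told apart by their first n characters.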

Parameters:n (int) – length of prefixes
Returns:dict of (word, line) tuples grouped by prefixes of length n
similar_linegroup(prefix)

Gets the list of lines of words beginning with prefix

As BIP39WordList sorts its internal copy of the wordlist, the lines in the returned list are sorted in alphabetic order.

Parameters:prefix (str) – the prefix
Returns:list of line numbers of words beginning with prefix
similar_linegroup_all(n)

Gets the entire hash table of lines of words grouped by all prefixes of length n

Parameters:n (int) – prefix length to group by
Returns:dictionary of lines of words grouped by length n prefixes
similar_linegroup_many(prefixes)

Gets the list of lines of words beginning with any of the prefixes in prefixes

As BIP39WordList sorts its internal copy of the wordlist, the lines in the returned list are sorted in alphabetic order.

Parameters:prefixes (list) – list of prefixes
Returns:list of lines of words beginning with any of the prefixes
similar_wordgroup(prefix)

Gets the list of words beginning with prefix

As BIP39WordList sorts its internal copy of the wordlist, the words in the returned list are sorted in alphabetic order.

Parameters:prefix (str) – the prefix
Returns:list of words beginning with prefix
similar_wordgroup_all(n)

Gets the entire hash table of words grouped by all prefixes of length n

As BIP39WordList sorts its internal copy of the wordlist, the words and lines in the returned tuple array are sorted in alphabetic order.

Parameters:n (int) – prefix length to group by
Returns:dictionary of words grouped by length n prefixes
similar_wordgroup_many(prefixes)

Gets the list of words beginning with any of the prefixes in prefixes

As BIP39WordList sorts its internal copy of the wordlist, the words in the returned list are sorted in alphabetic order.

Parameters:prefixes (list) – list of prefixes
Returns:list of words beginning with any of the prefixes
similargroup(prefix)

Gets the list of words and lines beginning with prefix

As BIP39WordList sorts its internal copy of the wordlist, the words and lines in the returned tuple array are sorted in alphabetic order.

Parameters:prefix (str) – the prefix
Returns:list of (word, line) tuples beginning with prefix
similargroup_all(n)

Gets the entire hash table of words and lines grouped by all prefixes of length n

Parameters:n (int) – prefix length to group by
Returns:dictionary of (word, line) tuples grouped by length n prefixes
similargroup_many(prefixes)

Gets the list of words and lines beginning with any of the prefixes in prefixes

As BIP39WordList sorts its internal copy of the wordlist, the words and lines in the returned tuple array are sorted in alphabetic order.

Parameters:prefixes (list) – list of prefixes
Returns:list of (word, line) tuples beginning with any of the prefixes
class bip39validator.MaxLengthResult(res, words_sorted, lines_sorted, threshold)

Length of each word exceeding a certain threshold.

Data structure returned or raised by BIP39WordList.test_max_length(). This class is not meant to be created directly.
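
The length queries below all follow the same pattern: keep the words, or their line numbers, whose length satisfies some condition. A small sketch, assuming the data is available as (word, line) tuples and using the hypothetical helper select:

```python
def select(pairs, predicate):
    """Return the (word, line) tuples whose word length satisfies predicate."""
    return [(word, line) for word, line in pairs if predicate(len(word))]


def words_of(selected):
    """Keep only the words from a list of (word, line) tuples."""
    return [word for word, _ in selected]


def lines_of(selected):
    """Keep only the line numbers from a list of (word, line) tuples."""
    return [line for _, line in selected]
```

For example, a predicate such as `lambda k: k > 8` plays the role of a "greater than" query, and `lambda k: k in (3, 4)` that of a "list" query.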

__init__(res, words_sorted, lines_sorted, threshold)

Initialize self. See help(type(self)) for accurate signature.

getlines_eq(n)

Gets the line numbers of words which have a length of n

Parameters:n (int) – length
Returns:a list of line numbers
getlines_gt(n=None)

Gets the line numbers of words which have a length greater than n

Parameters:n (int) – length
Returns:a list of line numbers
getlines_list(lengths)

Gets the line numbers of words which have a length inside the list lengths

Parameters:lengths (list) – list of lengths to check words with
Returns:a list of line numbers
getlines_long()

Gets the line numbers of the words that are longer than the threshold tested against.

Returns:a list of line numbers
getlines_lt(n)

Gets the line numbers of words which have a length less than n

Parameters:n (int) – length
Returns:a list of line numbers
getwords_eq(n)

Gets the words which have a length of n

Parameters:n (int) – length
Returns:a list of words
getwords_gt(n)

Gets the words which have a length greater than n

Parameters:n (int) – length
Returns:a list of words
getwords_list(lengths)

Gets the words which have a length inside the list lengths

Parameters:lengths (list) – list of lengths to check words with
Returns:a list of words
getwords_long()

Gets the words that are longer than the threshold tested against.

Returns:a list of words
getwords_lt(n)

Gets the words which have a length less than n

Parameters:n (int) – length
Returns:a list of words

Exceptions

exception bip39validator.InvalidWordList(**kwargs)

One or more wordlist words have invalid characters.

Exception raised by methods in BIP39WordList. This class is not meant to be created directly.

__init__(**kwargs)

Initialize self. See help(type(self)) for accurate signature.

err_lines = None

Tuple of line contents and line numbers of invalid words.

has_2048_words = None

Indicates if the wordlist has exactly 2048 words.

has_invalid_chars = True

Indicates if the wordlist contains invalid characters.

is_sorted = None

Indicates if the wordlist file is in sorted order.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

exception bip39validator.InvalidRemoteContent(url, content_type, expected_content_type)

URL has an unexpected content type

Exception raised by the BIP39WordList constructor. This class is not meant to be created directly.
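
A content type check of this kind could be sketched as below. The stand-in exception only mirrors the documented constructor arguments, require_content_type is a hypothetical helper, and the expected type "text/plain" is an assumption not stated in this reference.

```python
class InvalidRemoteContent(Exception):
    """Stand-in mirroring the documented constructor arguments."""
    def __init__(self, url, content_type, expected_content_type):
        super().__init__(
            f"{url}: got {content_type}, expected {expected_content_type}")
        self.url = url
        self.content_type = content_type
        self.expected_content_type = expected_content_type


def require_content_type(url, header, expected="text/plain"):
    """Raise if the media type of a Content-Type header is unexpected."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = header.split(";", 1)[0].strip().lower()
    if media_type != expected:
        raise InvalidRemoteContent(url, media_type, expected)
    return media_type
```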

__init__(url, content_type, expected_content_type)

Initialize self. See help(type(self)) for accurate signature.

content_type = None

The content type of url

url = None

The URL that returned the unexpected content type

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

exception bip39validator.ValidationFailed(status_obj=None)

One of the validation tests has failed.

Exception raised by methods in BIP39WordList. This class is not meant to be created directly.

__init__(status_obj=None)

Initialize self. See help(type(self)) for accurate signature.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.