Welcome to Vigges Developer Community - Open, Learning, Share
Welcome To Ask or Share your Answers For Others

parsing - How to count mistakes in words using Python?

I am making a program which is supposed to translate months of the year from French to English. I started with this code:

import sys

months_list_fr = ['janvier', 'février', 'mars', 'avril', 'mai', 'juin', 'juillet', 'août', 'septembre', 'octobre',
                  'novembre', 'décembre']

# All lowercase, because the input is lowercased before the lookup
months_list_en = ['january', 'february', 'march', 'april', 'may', 'june', 'july', 'august', 'september', 'october',
                  'november', 'december']

months_dict = {
    'janvier': 'January',
    'février': 'February',
    'mars': 'March',
    'avril': 'April',
    'mai': 'May',
    'juin': 'June',
    'juillet': 'July',
    'août': 'August',
    'septembre': 'September',
    'octobre': 'October',
    'novembre': 'November',
    'décembre': 'December'
}


def translate_month(month: str) -> str:
    month = month.lower()

    # French month: look up its English translation in the dict
    if month in months_list_fr:
        return months_dict[month]

    # Already an English month: return it capitalized, like the dict values
    elif month in months_list_en:
        return month.capitalize()

    # Neither list contains the word, so there must be a mistake
    else:
        print_err_n_exit()


def print_err_n_exit():
    print("There may be a typo in the month or maybe the month is in another language. See module translate.")
    # Exit with a non-zero status so callers can tell that an error occurred
    sys.exit(1)

Here is an example usage:

print(translate_month('janvIer'))

Output:

January

Process finished with exit code 0

Currently, the program translates a word only in one specific case: when it matches an entry in one of the two lists exactly (ignoring case).

Now, how about the case when I make a mistake in the word:

print(translate_month('janvUer'))

Or the case when I forget a letter:

print(translate_month('janvIe'))

Or again if I add a letter:

print(translate_month('janvIerz'))

In all these cases, my code does not recognize the word and outputs the error message. These cases have one thing in common: each word contains only a single mistake.

Is there an algorithm that can count the mistakes in a word? Each of the cases above would count as one mistake. With such a count, it would be easy to correct and translate a word (when mistakes_count <= 1) or to output the error message when a word is too different from the words I am searching for (when mistakes_count > 1).



1 Answer


You can use pyspellchecker, a library made for basic spell checking. It supports multiple languages, including English, Spanish, German, French, and Portuguese.

You can also choose whether the checker is case sensitive by passing a parameter to the SpellChecker constructor: SpellChecker(case_sensitive=True)

To learn more: https://pyspellchecker.readthedocs.io/en/latest/, https://readthedocs.org/projects/pyspellchecker/downloads/pdf/latest/
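
If you would rather count the mistakes yourself, the count described in the question (one wrong, missing, or extra letter equals one mistake) matches what is usually called the Levenshtein edit distance. Below is a minimal sketch using only the standard library. The names count_mistakes, closest_month, and max_mistakes are hypothetical and not part of the original code or of pyspellchecker.

```python
def count_mistakes(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, or substitutions that turn a into b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep a matching letter
            ))
        previous = current
    return previous[-1]


# Hypothetical helper: picks the known month closest to the input and
# returns None when even the best match has too many mistakes.
def closest_month(word, known_months, max_mistakes=1):
    word = word.lower()
    best = min(known_months, key=lambda month: count_mistakes(word, month))
    return best if count_mistakes(word, best) <= max_mistakes else None


french_months = ['janvier', 'février', 'mars', 'avril', 'mai', 'juin', 'juillet',
                 'août', 'septembre', 'octobre', 'novembre', 'décembre']

print(count_mistakes('janvuer', 'janvier'))     # 1 (wrong letter)
print(count_mistakes('janvie', 'janvier'))      # 1 (missing letter)
print(count_mistakes('janvierz', 'janvier'))    # 1 (extra letter)
print(closest_month('janvUer', french_months))  # janvier
print(closest_month('xyz', french_months))      # None
```

The corrected word returned by closest_month could then be looked up in months_dict as before. One caveat: short names such as 'mars' and 'mai' are only a couple of edits apart, so an input like 'mar' is one mistake away from both, and this sketch simply returns whichever comes first in the list.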

