Re: Language

These features are pending a proof-of-concept implementation. Currently I’m very busy with my studies and my job, and so is Maurice, so you probably won’t see anything anytime soon. To be honest, on my side it’s also laziness. But today I sat down and wanted to come up with some algorithms.

Some definitions for the following paragraphs:

Text: any text. Can be a complete book, a word, a sentence, etc.

LR<Languages>: a Language Recognizer for the given languages, e.g. LR<English>.

Possible applications of the LR

Advanced Texthelper

Features

Can correct words without a dictionary: recognizes when something probably isn’t a word.

A word of length X has an average score of AVG. If a word of length X scores lower than AVG by a significant margin S, it probably isn’t a word in the current language. I don’t know yet how to determine S.
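One possible way to pin down S, purely as an assumption on my part, is to measure the distance from AVG in standard deviations (a z-score) and flag anything more than S deviations below the average. The sketch below assumes the LR already hands us a score per word; the function name, the default S = 2.0 and the example numbers are made up for illustration.

```python
from statistics import mean, pstdev

def looks_like_non_word(score, known_scores, s=2.0):
    """Return True if `score` lies more than `s` standard deviations
    below the average of `known_scores`.

    `known_scores` are the scores of known words of the same length X.
    The threshold `s` is an assumption, not a value from the LR.
    """
    if len(known_scores) < 2:
        return False  # not enough data to judge
    avg = mean(known_scores)
    spread = pstdev(known_scores)
    if spread == 0:
        return score < avg  # all known words score the same
    return (score - avg) / spread < -s

# Hypothetical scores of known five-letter words.
five_letter_scores = [0.8, 0.9, 1.0, 1.1, 1.2]
suspicious = looks_like_non_word(0.5, five_letter_scores)
```

Whether 2.0 is a sensible default would have to be tuned against real words and real typos.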

Can correct sentences, since it also recognizes sentence structures.

Basically the same as the word recognition, but for sentences.

Can auto-complete sentences (and parts thereof).

Since the LR keeps track of how many times a word follows another word, it should be able to predict which word you are going to type next, which could speed up writing a text. As you can read, that’s a lot of shoulda coulda woulda, but maybe we are clever enough to pull this off.
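To make the idea a bit more concrete, here is a minimal sketch of "count which word follows which word, then suggest the most frequent followers". It is not the LR itself; the function names and the naive whitespace splitting are assumptions for illustration.

```python
from collections import Counter, defaultdict

def build_followers(text):
    """Count, for every word, how often each other word directly follows it."""
    followers = defaultdict(Counter)
    words = text.lower().split()  # naive tokenization, punctuation is not handled
    for current, following in zip(words, words[1:]):
        followers[current][following] += 1
    return followers

def predict_next(followers, word, k=3):
    """Return up to `k` words that most often followed `word`."""
    counts = followers.get(word.lower(), Counter())
    return [candidate for candidate, _ in counts.most_common(k)]

followers = build_followers("the cat sat on the mat the cat ran")
suggestions = predict_next(followers, "the")
```

A real text helper would need a much larger training text and probably more context than just the previous word.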

Language Recognizer

Features

Can tell if text A is more English than text B

If text A scores higher on LR<English> than text B, A is more likely to be English than B.

Test results indicate very few false positives with a simple implementation.
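For readers who want to play with the comparison, here is a rough sketch of one way such a score could be computed. It is an assumption, not the implementation behind the test results above: it counts character pairs in a tiny English sample and scores a text by the average smoothed log-probability of its character pairs. The sample text and all names are made up.

```python
import math
from collections import Counter

# Tiny stand-in for a real English training text (assumption).
SAMPLE = ("the quick brown fox jumps over the lazy dog and then the dog "
          "sleeps in the warm sun while the fox runs through the green forest")

def bigrams(text):
    """Return all overlapping character pairs of the lowercased text."""
    t = text.lower()
    return [t[i:i + 2] for i in range(len(t) - 1)]

def train(sample):
    """Count how often each character pair occurs in the sample."""
    return Counter(bigrams(sample))

def score(model, text):
    """Average log-probability per character pair, with add-one smoothing."""
    grams = bigrams(text)
    if not grams:
        return float("-inf")
    total = sum(model.values())
    vocab = len(model) + 1  # one extra bucket for unseen pairs
    return sum(math.log((model[g] + 1) / (total + vocab)) for g in grams) / len(grams)

english = train(SAMPLE)
a_is_more_english = score(english, "the dog runs in the sun") > score(english, "xkq zvj wqp zzx qjk")
```

Only the comparison between two scores means anything here; the absolute numbers depend entirely on the training sample.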

Can help auto-recognize whether something is human-readable text.

If a text gets a higher score than previous texts, it is more likely to be human-readable, which is useful for the “recovery” of encrypted text.
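As an illustration of the “recovery” idea, the sketch below tries every key of a Caesar cipher and keeps the candidate that scores highest. The Caesar cipher is my own choice of example, and `toy_score` is only a crude stand-in for LR<English>: it simply counts a handful of common English words.

```python
import string

def caesar_shift(text, shift):
    """Shift lowercase letters by `shift` positions; leave everything else alone."""
    alphabet = string.ascii_lowercase
    s = shift % 26
    rotated = alphabet[s:] + alphabet[:s]
    return text.translate(str.maketrans(alphabet, rotated))

def recover(ciphertext, score):
    """Try all 26 shifts and return the candidate with the highest score."""
    candidates = [caesar_shift(ciphertext, -k) for k in range(26)]
    return max(candidates, key=score)

# Crude stand-in for LR<English> (assumption): count common English words.
COMMON = {"the", "at", "me", "and", "of", "to", "a", "in", "is"}

def toy_score(text):
    return sum(word in COMMON for word in text.split())

secret = caesar_shift("meet me at the old bridge", 3)
recovered = recover(secret, toy_score)
```

With a real recognizer in place of `toy_score`, the same loop should work for any cipher where the number of candidate keys is small enough to try them all.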
