compromise
natural language processing in the browser
compromise uses semver, and pushes to npm frequently
(github-releases occasionally)
* **Major** is considered a breaking API change.
* **Minor** is considered a behaviour/performance change.
* **Patch** is an obvious, non-controversial bugfix.
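
As a rough illustration of the scheme above, the sketch below classifies the bump between two version strings. `bumpType` is a hypothetical helper written for this example, not part of compromise, and it assumes plain `major.minor.patch` strings with no pre-release suffix.

```javascript
// Hypothetical helper (not part of compromise): classify the bump
// between two plain 'major.minor.patch' version strings.
function bumpType(from, to) {
  const parse = (v) => v.split('.').map((n) => parseInt(n, 10));
  const [aMajor, aMinor, aPatch] = parse(from);
  const [bMajor, bMinor, bPatch] = parse(to);
  if (bMajor !== aMajor) return 'major'; // breaking api change
  if (bMinor !== aMinor) return 'minor'; // behaviour/performance change
  if (bPatch !== aPatch) return 'patch'; // non-controversial bugfix
  return 'none';
}

console.log(bumpType('10.7.2', '11.0.0')); // 'major'
console.log(bumpType('11.0.0', '11.1.0')); // 'minor'
console.log(bumpType('10.7.0', '10.7.2')); // 'patch'
```

Only the first differing component is reported, so a jump such as `10.7.2` to `11.0.0` reads as a major change regardless of the minor and patch numbers.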
### v11
##### 11.1.0
- add `#Multiple` Values tag, and changes to how invalid numbers like 'sixty fifteen hundred' are understood
- better em-dash/en-dash support
- better conjugate implicit verbs inside contractions - "i'm", "we've"
- nouns().articles() method
- neighborhoods as #Place
- support more complex noun-phrases with JustesonKatz in `.nouns()`
<!-- * include 'the #TitleCase' matches in .topics() -->
##### 11.2.1
- rolls-back some aggressive JustesonKatz stuff
- better support for emdash numberRange
- 'can\'t' contraction bugfix
- fix for dates().toShortForm()
##### 11.3.1
- almost-double the support for first-names
- changes to bestTag method
##### 11.4.1
- include old ending punctuation in a `.replace()` cmd
##### 11.5.0
- add #Abbreviation tag
- add #ProperNoun tag
- fixes for noun inflection
##### 11.0.0
- support for persistent lexicon/tagset changes
- `addTags, addWords, addRegs, addPlurals, addConjugations` methods to extend native data
  - `.plugin()` method to wrap all of these into one
  - (removal of `.packWords()` method)
- more `.organizations()` matches
- regex-support in .match() - `nlp('it is waaaay cool').match('/aaa/').out()//'waaaay'`
- improved apostrophe-s disambiguation
- support whitespace before sentence boundary
- improved QuestionWord tagging, some `.questions()` without a question-mark
- phrasalVerb conjugation
- new #Activity tag for Gerunds as nouns 'walking is fun'
- change ngram params to an object `{size:int, max:int}`
- implement '[]' capture-group syntax in .match()
- bring-back `map, filter, foreach and reduce` methods
- set `.words()` as alias for .terms()
- `people().firstNames()`, `people().lastNames()`
- split-out comma-separated adverbs
### v10
- cleanup & rename some `.value()` methods
- change lumping behaviour of lexicon terms with multiple words
- keep more former tags after a term replace method
- new `.random()` method
- new `.lessThan()`, `.greaterThan()`, `.equalTo()` methods
- new prefix/suffix/infix matches with `_ffix` syntax
- `tag()` supports a sequence of tags for a sequence of terms
- .match 'range' queries now use a real match - `#Adverb{2,4}`
- new `.before()` and `.after()` match methods
- removes `.lexicon()` method for many-lexicons concept
- changes params of `.replaceWith()` method to a 'keepTags' boolean
- improved .debug() and logging on client-side
##### 10.1.0
- fix return format of .isPlural(), so it acts like a match filter
- less-greedy date tagging & ambiguous month fixes
##### 10.2.0
- .trim() method
- adjective tagging fixes
- some new .out() methods
##### 10.3.0
- new `Percent` tag
- lump more units in with `.values()`
##### 10.4.0
- improved tagging of `VerbPhrase` and `Condition`
- fixes to contractions in sentence-changes - "i'm going -> i went"
- several verb conjugation fixes
- accept Terms & Result objects in .match() and .replace()
##### 10.5.0
- add increment/decrement/add/subtract methods to .values()
- add units(), noUnits() methods to .values()
- 'uncountable' nouns are no longer assumed to be singular
- money tag is no longer always a value
##### 10.6.0
- move internal lexicon around, to support new format in v11
- added states & provinces as #Region
- added #Comparable tag for adjectives that conjugate
##### 10.7.0
- improved `places()` parsing
- improved `{min,max}` match syntax
- new `.out('match')` method
- quiet addition of .pack() and .unpack() for owen
##### 10.7.2
- fix for '.watch' reserved word in efrt
### v9
##### 9.0.0
- rename `Term.tag` object to `Term.tags` so the `.tag()` method can work throughout more-consistently
- fix 'Auxillary' tag typo to 'Auxiliary'
- optimisation of .match(), and tagset - significant speedup!
- adds `.tagger()` method and cleanup extra params
- adds `wordStart` and `wordEnd` offsets to `.out('offset')` for whitespace+punctuation
- new `.has()` method for faster lookups
##### 9.1.0
- pretty-real filesize reduction by swapping es6 classes for es5 inheritance
### v8
##### 8.0.0
- less-ambitious date-parsing of nl-date forms
- filesize reduction using [efrt](https://github.com/nlp-compromise/efrt) data structure (254k -> 214k)
- 8.1.0 - add `nlp.tokenize()` method for disabling pos-tagging of input
- 8.2.0 - add `nlp.out('index')` method, 12 bug fixes
### v7 :postal_horn:
* 7.0.0 - weee! [big change!](https://github.com/nlp-compromise/compromise/wiki/v7-Upgrade,-welcome) *npm package rename*
* 7.0.15 - fix for IE9
### v6
* 6.5.0 - builds now using browserify + derequire()
* 6.4.0 - re-written term-lumper logic
* 6.3.0 - new nlp.lexicon({word:'POS'}) flow
* 6.0.0 - be consistent with `text.normal()`, `term.all_forms()`, `text.word_count()`. `text.normal()` includes sentence-terminators, like periods etc.
### v5
* 5.2.0 - airport codes support, helper methods for specific POS
* 5.1.0 - newlines split sentences
* 5.0.0 - Text methods now return this, instead of array of sentences
### v4
* 4.12.0 - more-sensible responses for invalid, non-string inputs
* 4.11.0 - 14 PRs, with fixes for currencies, pluralization, conjugation
* 4.10.0 - Value.to_text() new method, fix "Posessive" POS typo
* 4.9.0 - return of the text.spot() method (Re:#107)
* 4.8.0 - more aggressive lumping of dates, like 'last week of february'
* 4.7.0 - whitespace reproduction in .text() methods
* 4.6.0 - move negate from sentence to verb & statement
* 4.2.0 - rename 'implicit' to 'expansion' for smarter contractions
* 4.1.3 - added readable-compression to adj, verbs (121kb -> 117kb)
* 4.1.0 - hyphenated words are normalized into spaces
* 4.0.0 - grammar-aware match & replace functions
### v3 **(Breaking)**
* 3.0.2 - Statement & Question classes
* v3.0.0 - Feb 2016
* split ngram, locale, and syllables into plugins in separate repo
### v2
* v2.0.0 - Nov 2015 **(Breaking)**
* es6 classes, babel building
* better test coverage
* ngram uses term tokenization, so that 'Tony Hawk' is one term, and not two
* more organized pos rules
* Pos tagging is done implicitly now once nlp.Text is run
* Entity spotting is split into .people(), .place(), .organisations()
* unicode normalisation is killed
* opaque two-letter tags are gone
* plugin support
* passive tense detection
* lexicon can be augmented third-party
* date parsing results are different
### v1
* v1.1.0 - May 2015
smarter handling of ambiguous contractions ("he's" -> ["he is", "he has"])
* v1.0.0 - May 2015
added name genders and beginning of co-reference resolution ('Tony' -> 'he') API.
small breaking change on `Noun.is_plural` and `Noun.is_entity`, affording significant pos() speedup. Bumped Major version for these changes.
### v0
* v0.5.2 - May 2015
Phrasal verbs ('step up'), firstnames and .people()
* v0.4.0 - May 2015
Major file-size reduction through refactoring
* v0.3.0 - Jan 2015
New NER choosing algorithm, better capitalisation logic, consolidated tests
* v0.2.0 - Nov 2014
Sentence class methods, client-side demos