Raku grammar classes for Machine Learning (ML) entities (names).
The package does entity name recognition using regexes over a set of hashes (dictionaries) that map entity name phrases to entity identifiers (IDs). The hashes are obtained from the resource files.
In short, the package has a grammar-resource-based architecture. The same architecture is used in the Domain Specific Language (DSL) entity packages [AAr3-AAr6].
Remark: The associations between entity name phrases and entity IDs in the resource files are expected to change in the future, because of classifier system updates or usage feedback. This is one of the main reasons to use a grammar-resource-based architecture: subsequent package versions can ship better, fuller associations.
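The idea of matching phrases against a hash can be illustrated with a minimal, self-contained sketch. It is not the package's actual grammar: the grammar name `ToyEntityGrammar`, the sub `entity-id`, and the hash contents are hypothetical, and the real package loads its hashes from the resource files.

```raku
# Hypothetical phrase-to-ID mapping (assumption: the real data comes from resource files).
my %classifier-ids =
    'decision tree'          => 'DecisionTree',
    'gradient boosted trees' => 'GradientBoostedTrees',
    'random forest'          => 'RandomForest';

grammar ToyEntityGrammar {
    token TOP { <entity-name> }
    # Capture a run of words, then accept the match only if the
    # lower-cased phrase is a key of the hash.
    token entity-name { ( <word>+ % \h+ ) <?{ %classifier-ids{ $0.Str.lc }:exists }> }
    token word { \w+ }
}

# Return the entity ID for a phrase, or Nil if the phrase is not known.
sub entity-id(Str $phrase) {
    my $m = ToyEntityGrammar.parse($phrase);
    return $m ?? %classifier-ids{ $m<entity-name>.Str.lc } !! Nil;
}

say entity-id('gradient boosted trees');
say entity-id('Random Forest');
say entity-id('purple trees');
```

With this design, improving recognition means editing the hash (or the file it is loaded from), while the grammar rules stay unchanged.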
zef install https://github.com/antononcube/Raku-DSL-Entity-MachineLearning.git
use DSL::Entity::MachineLearning;
use DSL::Entity::MachineLearning::ResourceAccess;

my $pCOMMAND = DSL::Entity::MachineLearning::Grammar;
$pCOMMAND.set-resources(DSL::Entity::MachineLearning::resource-access-object());

say $pCOMMAND.parse('DecisionTree', rule => 'machine-learning-entity-command');
say $pCOMMAND.parse('gradient boosted trees', rule => 'machine-learning-entity-command');
say $pCOMMAND.parse('roc curve', rule => 'machine-learning-entity-command');
# ｢DecisionTree｣
#  classifier-entity-command => ｢DecisionTree｣
#   entity-classifier-name => ｢DecisionTree｣
#    0 => ｢DecisionTree｣
#     word-value => ｢DecisionTree｣
# ｢gradient boosted trees｣
#  classifier-entity-command => ｢gradient boosted trees｣
#   entity-classifier-name => ｢gradient boosted trees｣
#    0 => ｢gradient boosted trees｣
#     word-value => ｢gradient｣
#     word-value => ｢boosted｣
#     word-value => ｢trees｣
# ｢roc curve｣
#  classifier-measurement-entity-command => ｢roc curve｣
#   entity-classifier-measurement-name => ｢roc curve｣
#    0 => ｢roc curve｣
#     word-value => ｢roc｣
#     word-value => ｢curve｣
The package provides a Command Line Interface (CLI) to its functionalities:
> ToMachineLearningEntityCode --help
# Usage:
#   ToMachineLearningEntityCode <command> [--target=<Str>] [--user=<Str>] -- Conversion of (natural) DSL machine learning entity name into code.
#   ToMachineLearningEntityCode <target> <command> [--user=<Str>] -- Both target and command as arguments.
#
#     <command>         natural language command (DSL commands)
#     --target=<Str>    target language/system/package (defaults to 'WL-System') [default: 'WL-System']
#     --user=<Str>      user identifier (defaults to '') [default: '']
#     <target>          Programming language.
Remark: (Currently) the CLI script always returns results in JSON format.
The resource file "ROCFunctionNameToEntityID_EN.csv" uses the names and mappings in [WK1]. (See also the related package [AAr7].)
The initial Bulgarian versions of the resource files, with name suffix "_BG.csv", were derived by automatic translation of the corresponding English content. Afterwards, the Bulgarian mappings were reviewed and manually modified.
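As a rough illustration of how a name-to-ID resource file could be turned into a hash, here is a minimal sketch. The two-column "phrase,ID" layout, the sample rows, and the variable names are assumptions made for this example; the actual column layout and contents of the package's CSV files may differ.

```raku
# Hypothetical CSV content (assumption: one "phrase,ID" pair per line).
my $csv-text = q:to/END/;
    roc curve,ROCCurve
    true positive rate,TPR
    recall,TPR
    END

# Build the phrase => ID hash, normalizing phrases to lower case.
my %phrase-to-id;
for $csv-text.lines -> $line {
    next unless $line.trim;                          # skip blank lines
    my ($phrase, $id) = $line.split(',', 2)».trim;   # split on the first comma only
    %phrase-to-id{$phrase.lc} = $id;
}

say %phrase-to-id<recall>;
say %phrase-to-id{'roc curve'};
```

Several phrases may map to the same ID, which is how synonyms (or translated names, as in the "_BG.csv" files) can share one entity.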
[WK1] Wikipedia entry, "Receiver operating characteristic".
[WRI1] Wolfram Research (2014), Classify, Wolfram Language function, https://reference.wolfram.com/language/ref/Classify.html (updated 2021).
[WRI2] Wolfram Research, Inc., Machine Learning Methods.
[WRI3] Wolfram Research (2014), ClassifierMeasurements, Wolfram Language function, https://reference.wolfram.com/language/ref/ClassifierMeasurements.html (updated 2021).
[WRI4] Wolfram Research (1988), Information, Wolfram Language function, https://reference.wolfram.com/language/ref/Information.html (updated 2021).
[AAr1] Anton Antonov, DSL::English::ClassificationWorkflows Raku package, (2020-2022), GitHub/antononcube.