Version | License | PHP version | Categories |
---|---|---|---|
basset-ir 2.52 | GNU Lesser Genera... | 7.1 | Algorithms, PHP 5, Statistics, Text p... |
This package can retrieve, transform and process text documents.
Basset is a full-text Information Retrieval (IR) library for PHP. It is a collection of developments in the field of IR, ported over to PHP for research purposes.
Basset provides different ways of searching through documents in a collection (ad-hoc retrieval) by applying advanced and experimental IR algorithms and techniques gathered from research studies and conferences.
The Cranfield Collection was the pioneering test collection in information retrieval, used to validate a system's effectiveness.
I've included the 1,400-abstract Cranfield Collection as an XML file that you can parse into separate files.
The test file at tests/sample.php can be run right away: it parses the collection and runs a search for a single test query. Customize it to your needs.
See Cranfield/cranfield-collection/cranqrel for Glasgow's qrels (relevance judgments).
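As a rough illustration of how a qrels file can be used to check a ranking, here is a minimal sketch in plain PHP. It does not use any Basset class. The functions `parseQrels` and `precisionAtK` are hypothetical helpers written for this example, and the sketch assumes each qrels line holds three whitespace-separated integers (query id, document id, relevance code). Treating every listed document as relevant is a simplification; cranqrel.readme describes what the codes mean.

```php
<?php
declare(strict_types=1);

/**
 * Parse qrels text into [queryId => [docId => relevanceCode]].
 * Assumption: three whitespace-separated integers per line.
 */
function parseQrels(string $text): array
{
    $qrels = [];
    foreach (preg_split('/\R/', $text) as $line) {
        $fields = preg_split('/\s+/', trim($line), -1, PREG_SPLIT_NO_EMPTY);
        if (count($fields) < 3) {
            continue; // skip blank or malformed lines
        }
        [$queryId, $docId, $code] = array_map('intval', $fields);
        $qrels[$queryId][$docId] = $code;
    }
    return $qrels;
}

/**
 * Precision at k: share of the top-k ranked documents that appear
 * in the judged set for the query (any listed code counts as relevant).
 */
function precisionAtK(array $rankedDocIds, array $judged, int $k): float
{
    if ($k <= 0) {
        return 0.0;
    }
    $hits = 0;
    foreach (array_slice($rankedDocIds, 0, $k) as $docId) {
        if (isset($judged[$docId])) {
            $hits++;
        }
    }
    return $hits / $k;
}

// Illustrative input only, not a verified excerpt of the shipped file.
$sample = "1 184 2\n1 29 2\n2 12 3\n";
$qrels = parseQrels($sample);

// Hypothetical ranking returned by some search for query 1.
$p = precisionAtK([184, 7, 29, 51], $qrels[1], 4);
echo $p, PHP_EOL; // two of the four ranked documents are judged
```

With the real file, the text would come from something like `file_get_contents('Cranfield/cranfield-collection/cranqrel')`, and the ranking from whichever model is being evaluated.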
I've also included the SMART system's stopword list for standardization (see stopwords/stopwords.txt).
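To show what a stopword list is for, here is a small sketch in plain PHP that drops stopwords from a token stream. It is not Basset's own `StopWords` class, whose interface is not shown here. `loadStopwords` and `removeStopwords` are hypothetical helpers, and the sketch assumes the list holds one word per line.

```php
<?php
declare(strict_types=1);

/**
 * Build a lookup set from stopword text.
 * Assumption: one stopword per line.
 */
function loadStopwords(string $text): array
{
    $set = [];
    foreach (preg_split('/\R/', $text) as $line) {
        $word = strtolower(trim($line));
        if ($word !== '') {
            $set[$word] = true; // keys give constant-time lookups
        }
    }
    return $set;
}

/** Return the tokens that are not in the stopword set, reindexed from 0. */
function removeStopwords(array $tokens, array $stopwords): array
{
    $kept = array_filter($tokens, function (string $token) use ($stopwords): bool {
        return !isset($stopwords[strtolower($token)]);
    });
    return array_values($kept);
}

// A tiny inline list stands in for the shipped file in this example.
$stopwords = loadStopwords("a\nthe\nof\nin\n");

// Split on non-word characters; a real pipeline would use a proper tokenizer.
$tokens = preg_split('/\W+/', 'The effect of slipstream in a wing', -1, PREG_SPLIT_NO_EMPTY);

$kept = removeStopwords($tokens, $stopwords);
print_r($kept); // effect, slipstream, wing
```

Storing the words as array keys rather than values keeps each lookup cheap, which matters when filtering every token of a 1,400-document collection. With the bundled list, the input would be `file_get_contents('stopwords/stopwords.txt')`.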
Files (200)
File | Role | Description |
---|---|---|
config (1 file) | | |
Cranfield (1 file, 1 directory) | | |
src (1 directory) | | |
stopwords (1 file) | | |
tests (3 files, 1 directory) | | |
.travis.yml | Data | Auxiliary data |
autoload.php | Aux. | Auxiliary script |
composer.json | Data | Auxiliary data |
LICENSE | Lic. | License text |
README.markdown | Doc. | Documentation |
Files: Cranfield
File | Role | Description |
---|---|---|
cranfield-collection (4 files) | | |
cranfield_parser.php | Class | Class source |
Files: Cranfield/cranfield-collection
File | Role | Description |
---|---|---|
cran.all.1400.xml-format.xml | Data | Auxiliary data |
cran.qry.xml-format.xml | Data | Auxiliary data |
cranqrel | Data | Auxiliary data |
cranqrel.readme | Doc. | Documentation |
Files: src/Basset
File | Role | Description |
---|---|---|
Collections (2 files) | | |
Documents (3 files) | | |
Expansion (14 files) | | |
Feature (3 files) | | |
Index (6 files) | | |
Math (1 file) | | |
MetaData (1 file) | | |
Metric (31 files) | | |
Models (34 files, 7 directories) | | |
Normalizers (3 files) | | |
Results (2 files) | | |
Search (1 file) | | |
Statistics (3 files) | | |
Stemmers (3 files) | | |
Tokenizers (3 files) | | |
Utils (4 files) | | |
Files: src/Basset/Collections
File | Role | Description |
---|---|---|
CollectionInterface.php | Class | Class source |
CollectionSet.php | Class | Class source |
Files: src/Basset/Documents
File | Role | Description |
---|---|---|
Document.php | Class | Class source |
DocumentInterface.php | Class | Class source |
TokensDocument.php | Class | Class source |
Files: src/Basset/Expansion
File | Role | Description |
---|---|---|
CauchyDE.php | Class | Class source |
CauchyDE.php | Class | Class source |
DifferentialEvolution.php | Class | Class source |
Feedback.php | Class | Class source |
GeneticAlgorithm.php | Class | Class source |
IdeDecHi.php | Class | Class source |
IdeRegular.php | Class | Class source |
PRFEAVSMInterface.php | Class | Class source |
PRFInterface.php | Class | Class source |
PRFVSMInterface.php | Class | Class source |
RelevanceModel.php | Class | Class source |
Rocchio.php | Class | Class source |
SelfAdaptiveDE.php | Class | Class source |
SelfAdaptiveDE.php | Class | Class source |
Files: src/Basset/Feature
File | Role | Description |
---|---|---|
FeatureExtraction.php | Class | Class source |
FeatureInterface.php | Class | Class source |
FeatureVector.php | Class | Class source |
Files: src/Basset/Index
File | Role | Description |
---|---|---|
Index.php | Class | Class source |
IndexEntry.php | Class | Class source |
IndexInterface.php | Class | Class source |
IndexManager.php | Class | Class source |
IndexReader.php | Class | Class source |
IndexWriter.php | Class | Class source |
Files: src/Basset/Metric
File | Role | Description |
---|---|---|
BrayCurtisDistance.php | Class | Class source |
CanberraDistance.php | Class | Class source |
ChebyshevDistance.php | Class | Class source |
ChiSquareDistance.php | Class | Class source |
CosineSimilarity.php | Class | Class source |
CzekanowskiSimilarity.php | Class | Class source |
DiceSimilarity.php | Class | Class source |
DistanceInterface.php | Class | Class source |
EuclideanDistance.php | Class | Class source |
HellingerDistance.php | Class | Class source |
JaccardIndex.php | Class | Class source |
JSDivergence.php | Class | Class source |
KLDivergence.php | Class | Class source |
KulczynskiDistance.php | Class | Class source |
LorentzianDistance.php | Class | Class source |
MatusitaDistance.php | Class | Class source |
Metric.php | Class | Class source |
MetricInterface.php | Class | Class source |
MotykaSimilarity.php | Class | Class source |
OverlapCoefficient.php | Class | Class source |
RenyiDivergence.php | Class | Class source |
RuzickaSimilarity.php | Class | Class source |
SimilarityInterface.php | Class | Class source |
SoergleDistance.php | Class | Class source |
SqrtCosineSimilarity.php | Class | Class source |
StamatatosDistance.php | Class | Class source |
test.php | Class | Class source |
TriangleSectorSimilarity.php | Class | Class source |
TverskyIndex.php | Class | Class source |
VectorSimilarity.php | Class | Class source |
VSMInterface.php | Class | Class source |
Files: src/Basset/Models
File | Role | Description |
---|---|---|
Contracts (5 files) | | |
DFIModels (5 files) | | |
DFRAfterEffect (4 files) | | |
DFRModels (8 files) | | |
IBDistribution (3 files) | | |
IBLambda (4 files) | | |
Normalization (11 files) | | |
AbsoluteDiscountingLM.php | Class | Class source |
AtireBM25.php | Class | Class source |
BaseIdf.php | Class | Class source |
BM25.php | Class | Class source |
BM25L.php | Class | Class source |
BM25Plus.php | Class | Class source |
BSDS.php | Class | Class source |
DFIModel.php | Class | Class source |
DFRModel.php | Class | Class source |
DirichletLM.php | Class | Class source |
DirichletSPUD.php | Class | Class source |
HiemstraLM.php | Class | Class source |
IBModel.php | Class | Class source |
Idf.php | Class | Class source |
IdfDFR.php | Class | Class source |
IdfOkapi.php | Class | Class source |
IdfSparckRobertson.php | Class | Class source |
IRRA12.php | Class | Class source |
JelinekMercerLM.php | Class | Class source |
JelinekMercerSPUD.php | Class | Class source |
LemurTfIdf.php | Class | Class source |
ModBM25.php | Class | Class source |
PivotedConcaveTF.php | Class | Class source |
PivotedConcaveTFIDF.php | Class | Class source |
PivotedTfIdf.php | Class | Class source |
TermCount.php | Class | Class source |
TermFrequency.php | Class | Class source |
TfConcaveK.php | Class | Class source |
TfConcaveLog.php | Class | Class source |
TfIdf.php | Class | Class source |
TfRobertson.php | Class | Class source |
TwoStageLM.php | Class | Class source |
WeightedModel.php | Class | Class source |
XSqrAM.php | Class | Class source |
Files: src/Basset/Models/Contracts
File | Role | Description |
---|---|---|
IDFInterface.php | Class | Class source |
LanguageModelInterface.php | Class | Class source |
ProbabilisticModelInterface.php | Class | Class source |
TFInterface.php | Class | Class source |
WeightedModelInterface.php | Class | Class source |
Files: src/Basset/Models/DFIModels
File | Role | Description |
---|---|---|
ChiSquared.php | Class | Class source |
DFIInterface.php | Class | Class source |
DFIModel.php | Class | Class source |
Saturated.php | Class | Class source |
Standardized.php | Class | Class source |
Files: src/Basset/Models/DFRAfterEffect
File | Role | Description |
---|---|---|
AfterEffect.php | Class | Class source |
AfterEffectInterface.php | Class | Class source |
B.php | Class | Class source |
L.php | Class | Class source |
Files: src/Basset/Models/DFRModels
File | Role | Description |
---|---|---|
BasicModel.php | Class | Class source |
BasicModelInterface.php | Class | Class source |
BE.php | Class | Class source |
G.php | Class | Class source |
In.php | Class | Class source |
InExp.php | Class | Class source |
InFreq.php | Class | Class source |
P.php | Class | Class source |
Files: src/Basset/Models/IBDistribution
File | Role | Description |
---|---|---|
IBDistributionInterface.php | Class | Class source |
LLDistribution.php | Class | Class source |
SPLDistribution.php | Class | Class source |
Files: src/Basset/Models/IBLambda
File | Role | Description |
---|---|---|
IBLambdaInterface.php | Class | Class source |
Lambda.php | Class | Class source |
LambdaDF.php | Class | Class source |
LambdaTTF.php | Class | Class source |
Files: src/Basset/Models/Normalization
File | Role | Description |
---|---|---|
Normalization.php | Class | Class source |
NormalizationBM25.php | Class | Class source |
NormalizationDP.php | Class | Class source |
NormalizationF.php | Class | Class source |
NormalizationH1.php | Class | Class source |
NormalizationH2.php | Class | Class source |
NormalizationH2E.php | Class | Class source |
NormalizationInterface.php | Class | Class source |
NormalizationJMDF.php | Class | Class source |
NormalizationJMTF.php | Class | Class source |
NormalizationP.php | Class | Class source |
Files: src/Basset/Normalizers
File | Role | Description |
---|---|---|
English.php | Class | Class source |
Normalizer.php | Class | Class source |
NormalizerInterface.php | Class | Class source |
Files: src/Basset/Results
File | Role | Description |
---|---|---|
ResultEntry.php | Class | Class source |
ResultSet.php | Class | Class source |
Files: src/Basset/Statistics
File | Role | Description |
---|---|---|
CollectionStatistics.php | Class | Class source |
EntryStatistics.php | Class | Class source |
PostingStatistics.php | Class | Class source |
Files: src/Basset/Stemmers
File | Role | Description |
---|---|---|
RegexStemmer.php | Class | Class source |
Stemmer.php | Class | Class source |
StemmerInterface.php | Class | Class source |
Files: src/Basset/Tokenizers
File | Role | Description |
---|---|---|
TokenizerInterface.php | Class | Class source |
WhitespaceAndPunctuationTokenizer.php | Class | Class source |
WhitespaceTokenizer.php | Class | Class source |
Files: src/Basset/Utils
File | Role | Description |
---|---|---|
Serializer.php | Class | Class source |
StopWords.php | Class | Class source |
TransformationInterface.php | Class | Class source |
TransformationSet.php | Class | Class source |
Files: tests
File | Role | Description |
---|---|---|
Basset (3 directories) | | |
bootstrap.php | Aux. | Auxiliary script |
phpunit.xml | Data | Auxiliary data |
sample.php | Class | Class source |
Files: tests/Basset
File | Role | Description |
---|---|---|
Documents (3 files) | | |
Metric (25 files) | | |
Tokenizers (3 files) | | |
Files: tests/Basset/Documents
File | Role | Description |
---|---|---|
BaseDocuments.php | Class | Class source |
DocumentsTest.php | Class | Class source |
TokensDocumentTest.php | Class | Class source |
Files: tests/Basset/Metric
Files: tests/Basset/Tokenizers
File | Role | Description |
---|---|---|
BaseTokenizers.php | Class | Class source |
WhitespaceAndPunct...onTokenizerTest.php | Class | Class source |
WhitespaceTokenizerTest.php | Class | Class source |