Data Rights
Contract Number: N6833522C0500
Contractor Name: P.W. Communications, Inc.
Contractor Address: 11200 Rockville Pike Suite 130 Rockville, MD 20852
Expiration of Data Rights Period: January 16, 2029
The Government’s rights to use, modify, reproduce, release, perform, display, or disclose technical data or computer software marked with this legend are restricted during the period shown as provided in paragraph (b)(4) of the Rights in Noncommercial Technical Data and Computer Software Small Business Innovation Research (SBIR) Program clause contained in the above identified contract. No restrictions apply after the expiration date shown above. Any reproduction of technical data, computer software, or portions thereof marked with this legend must also reproduce the markings.
Updates
Implemented method to collapse outlier topic into real topics
Tested multiple embedding models
Implemented multiple topic representations
Distinct Text Level Topics
Distinct Award Level
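The outlier-collapse step can be illustrated with a minimal sketch. It assumes the outlier topic carries the label -1 (the BERTopic convention) and that each outlier document is reassigned to the real topic whose centroid embedding is most similar. The function names and the centroid strategy are assumptions for illustration, not necessarily the method that was implemented; BERTopic also provides its own reduce_outliers method.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def collapse_outliers(embeddings, topics, outlier=-1):
    """Reassign outlier documents to the nearest real-topic centroid (hypothetical helper)."""
    sums, counts = {}, {}
    for vec, topic in zip(embeddings, topics):
        if topic == outlier:
            continue
        if topic not in sums:
            sums[topic] = [0.0] * len(vec)
            counts[topic] = 0
        sums[topic] = [s + v for s, v in zip(sums[topic], vec)]
        counts[topic] += 1
    # Mean embedding of every real topic.
    centroids = {t: [s / counts[t] for s in sums[t]] for t in sums}
    if not centroids:
        return list(topics)  # nothing to collapse into
    result = []
    for vec, topic in zip(embeddings, topics):
        if topic == outlier:
            topic = max(centroids, key=lambda c: cosine(vec, centroids[c]))
        result.append(topic)
    return result
```

Calling collapse_outliers(doc_embeddings, topic_labels) returns a new label list in which no document remains in the outlier topic, provided at least one real topic exists.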
YAKE
Implemented YAKE, another keyword extraction tool, which is significantly faster and less compute-intensive than KeyBERT.
The YAKE and KeyBERT keywords have clear analytical value. They might also help guide slightly better BERTopic labels if we could seed the model with pre-defined labels built from these keywords, or from conceptual award types for FPDS Phase IIIs supplied by subject matter experts.
Collocations
The BERTopic/YAKE process showed that in certain cases one can infer, or even know with reasonable certainty, the provenance of the FPDS Phase III award. Collocation analysis, combined with some of the odd BERTopic labels, led me to this finding.
Full FPDS Phase III Collocations: Bigrams, Trigrams and Quadgrams
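As a rough illustration of how bigram, trigram and quadgram counts can be produced, here is a minimal sketch using only the Python standard library. It ranks n-grams by raw frequency; a real collocation analysis would typically add an association score such as PMI. The tokenizer, the function names and the sample descriptions are assumptions for illustration, not the actual pipeline or data.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())

def ngram_counts(docs, n):
    """Count every run of n consecutive tokens across all documents."""
    counts = Counter()
    for doc in docs:
        tokens = tokenize(doc)
        counts.update(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return counts

# Hypothetical award descriptions, for illustration only.
sample_docs = [
    "Phase III follow-on to previous SBIR Phase II",
    "Follow-on to previous SBIR topic",
]

bigrams = ngram_counts(sample_docs, 2)
trigrams = ngram_counts(sample_docs, 3)
quadgrams = ngram_counts(sample_docs, 4)
```

Calling bigrams.most_common(20) then lists the most frequent two-word phrases; the same call works for the trigram and quadgram counters.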
Methodology: First-Guess Targeted Collocations to Find Provenance
This was just a best guess at phrases and regex patterns that might find blocks of text where we can figure out how a given FPDS Phase III came to be. A more exhaustive corpus of search terms, along with discussions on patterns in topic codes and contract ID references, would significantly improve the quality of this method.
previous sbir
previous sttr
previous sb
prior sbir
prior sttr
topic af
topic fa
transition
af[0-9]
fa[0-9]
open topic
phase i
phase ii
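The search terms above could be applied roughly as follows. This is a minimal sketch, assuming case-insensitive matching, a leading word boundary, and a fixed character window of context around each hit; the function name, the window size and the sample sentence are assumptions for illustration.

```python
import re

# Search terms copied from the list above; entries are treated as regex fragments.
TERMS = [
    r"previous sbir", r"previous sttr", r"previous sb",
    r"prior sbir", r"prior sttr",
    r"topic af", r"topic fa", r"transition",
    r"af[0-9]", r"fa[0-9]",
    r"open topic", r"phase i", r"phase ii",
]

# Longer terms go first so "phase ii" wins over "phase i" at the same position.
# Assumption: a leading \b keeps "af1" or "fa8" from matching inside other tokens.
PATTERN = re.compile(
    r"\b(?:" + "|".join(sorted(TERMS, key=len, reverse=True)) + r")",
    re.IGNORECASE,
)

def find_provenance_snippets(text, window=40):
    """Return (matched term, surrounding text) pairs for each hit (hypothetical helper)."""
    hits = []
    for match in PATTERN.finditer(text):
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        hits.append((match.group(0).lower(), text[start:end]))
    return hits
```

Note that, as written, "phase ii" also matches the first eight characters of "Phase III", so descriptions that merely name the award phase will produce hits and may need filtering downstream.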