Requirement | Importance | Comments |
---|---|---|
STIG Pipeline |
Scrape the STIG document library to download all zip files. | High | The zip files are multi-level nested zip files. |
Unzip each STIG file to retrieve the XML files. | High | |
Store the XML files for later use. | High | The hierarchy of zip files must be maintained to ensure follow-on functions have context. |
Identify which documents within the hierarchy are Authority Documents. | High | The zip files may contain README files or other files that do not constitute Authority Documents. |
Identify which files are Glossary-specific. | High | Some files may solely be Glossaries with term-definition pair entries. Ensure those documents are also processed for follow-on steps. |
Detect file metadata changes from prior processing. | High | Potential metadata changes could be a new document or a new version of an old document. |
Only pass new or changed documents further down the pipeline. | High | |
NIST 800-53 Pipeline |
Access the GitHub repository for all NIST 800-53 content. | High | |
Retrieve and store the XML, JSON, and YAML files for later use. | High | |
Identify which documents are Authority Documents. | High | |
Identify which files are Glossary-specific. | High | Some files may solely be Glossaries with term-definition pair entries. Ensure those documents are also processed for follow-on steps. |
Detect file metadata changes from prior processing. | High | |
Only pass new or changed documents further down the pipeline. | High | |
FedRAMP Data Pipeline |
Access the GitHub repository for all FedRAMP content. | | |
Retrieve and store the XML, JSON, and YAML files for later use. | | |
Identify which documents are Authority Documents. | | |
Identify which files are Glossary-specific. | High | Some files may solely be Glossaries with term-definition pair entries. Ensure those documents are also processed for follow-on steps. |
Detect file metadata changes from prior processing. | | |
Only pass new or changed documents further down the pipeline. | | |
eCFR Data Pipeline |
Access the eCFR files via the eCFR APIs. | | |
Store files for later use. | | |
Identify which documents are Authority Documents. | | |
Identify which files are Glossary-specific. | High | Some files may solely be Glossaries with term-definition pair entries. Ensure those documents are also processed for follow-on steps. |
Detect file metadata changes from prior processing. | | |
Only pass new or changed documents further down the pipeline. | | |
General Pipeline |
Catalog each source Authority Document. | High | Gather all information required by the Common Data Format. |
Identify and extract Citations from the Authority Document. | High | Citations are passages in the Authority Document that contain Mandates (requirements) or related contextual information, such as stubs, informational passages, and information-gathering passages. |
Maintain Citation structure. | High | Since Authority Documents will contain multiple Citations and passages may have related Citations, that structure must be maintained to know the relationship between Citations. |
Extract Glossary from within Authority Documents. | High | Some Authority Documents may have a glossary within the document. This will typically be near the end of the file. Extract the Glossary details including the Title, source, and all term-definition pairs. |
Extract Glossary from glossary-specific files. | High | Some files may only have Glossary entries with term-definition pairs. Extract the Glossary details including the Title, source, and all term-definition pairs. |
Detect content changes from prior loads. | High | Needs discussion. See questions. |
Transform the Authority Document into the Common Data Format. | High | Use the transformation documentation as the reference for how source document schema structures relate to the Common Data Format. |
Transform the Citations related to the Authority Document into the Common Data Format. | High | As above, use the CDF transformation document as reference. |
Transform the Glossaries into the Common Data Format. | High | As above, use the CDF transformation document as reference. |
Load Authority Documents into the Unified Compliance Platform. | High | The UCF engineering team will determine the optimal approach for loading (API, service, …). |
Load Citations into the Unified Compliance Platform. | High | Same as above. |
Load Glossaries into the Unified Compliance Platform. | High | Same as above. |
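Several of the STIG pipeline requirements above (unzipping multi-level nested zip files, keeping the zip hierarchy, and passing only new or changed documents downstream) can be illustrated with a short Python sketch. This is one possible approach under stated assumptions, not a prescribed design: the function names, the `!/` separator used to record nesting, and the use of a SHA-256 content hash as a stand-in for "file metadata" are all hypothetical choices made for illustration. The real pipeline may compare version numbers or release dates instead.

```python
import hashlib
import io
import zipfile
from typing import Dict, Iterator, Tuple


def iter_nested_xml(data: bytes, prefix: str = "") -> Iterator[Tuple[str, bytes]]:
    """Yield (hierarchical_path, content) for each XML file in a nested zip.

    The path keeps every enclosing zip name, joined with "!/" (an assumed
    convention), so follow-on steps retain the original hierarchy.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            path = f"{prefix}{info.filename}"
            content = archive.read(info)
            name = info.filename.lower()
            if name.endswith(".zip"):
                # Recurse into the inner archive, extending the path prefix.
                yield from iter_nested_xml(content, prefix=f"{path}!/")
            elif name.endswith(".xml"):
                yield path, content
            # Anything else (README files, PDFs, ...) is skipped here.


def fingerprint(content: bytes) -> str:
    """Content hash used as a change marker (assumption: hash, not metadata)."""
    return hashlib.sha256(content).hexdigest()


def new_or_changed(files: Dict[str, bytes], manifest: Dict[str, str]) -> Dict[str, bytes]:
    """Return only the files that are absent from, or differ from, the prior manifest."""
    return {
        path: content
        for path, content in files.items()
        if manifest.get(path) != fingerprint(content)
    }


def make_zip(entries: Dict[str, bytes]) -> bytes:
    """Demo helper: build an in-memory zip from a name-to-bytes mapping."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, payload in entries.items():
            archive.writestr(name, payload)
    return buffer.getvalue()
```

In use, a caller would read a downloaded zip into bytes, collect `dict(iter_nested_xml(data))`, and compare the result against the manifest saved by the previous run. Only the entries returned by `new_or_changed` would continue down the pipeline, after which the manifest is rewritten with the current fingerprints.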