Abstract

A system and method translate clinical-trial protocol text into structured electronic-data-capture (EDC) edit-check rules (each rule expressed as field / operator / value / action with a severity, an action-on-violation, a confidence score, a human-readable rationale, and a provenance reference to the originating protocol span). Untrusted protocol text is hardened against prompt injection before any model is invoked, via length capping, control-character removal, a configurable set of injection-pattern filters, and role-marker stripping. Rule generation is a hybrid of (a) a deterministic knowledge-base pattern matcher and (b) an optional large-language-model (LLM) augmentation pass; generated rules are bound to a CRF (case report form) schema, de-duplicated against existing rules, confidence-scored, and optionally re-prompted when schema validation fails. The disclosure covers the integrated pipeline and a broad range of alternative embodiments.

Creative Commons License

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.

Share

COinS