The research data management terminology formerly stewarded by CASRAI is now managed by CODATA. The terminology is reviewed annually through a lightweight, pragmatic process that draws on the expertise of a voluntary expert Working Group. The version of the terminology presented here was reviewed in 2022. The draft 2022 release was recently open for community feedback. The new finalised edition of the terminology will soon be available online, linked from this page. For further information on the terminology or the review process, please contact the RDM Terminology Working Group convener: laura AT codata.org.
Scope of the terminology
The goal of this terminology is to gather the key terms needed for a common understanding of the research data management domain.
Research data management (RDM) refers to the storage, access and preservation of data created or collected in the course of research. Research data management practices cover the entire lifecycle of the data, from planning the investigation to conducting it, and from backing up data as it is created and used to long-term preservation of data deliverables after the research investigation has concluded.
Definitions are intended to be clear and unambiguous, and where possible, fit with common usage. Definitions should be apposite across research data management activities of key stakeholders, including those working on research data management within the context of research, data management, digital curation and preservation, research management, research policy, open data advocacy, computer science, information management, research administration, library, scholarly publishing, digital archiving and research funding roles. Some terms may have more than one definition, in which case the relevant context should be specified.
If a consensus definition can be easily found elsewhere, the term is out of scope. This terminology is limited to the specific concepts necessary for a common understanding of RDM.
This scope statement has been revised and approved by the 2021-22 RDM Terminology WG.
Browse the previous terminology
NB: This version of the terminology was available up to and including 2021. Significant changes were made in 2022 as a result of the 2021-22 review cycle and will shortly be reflected here.
- Access
- Access control list
- Access controls
- Accession number
- Access workflow
- Active archive
- Active data
- Ad hoc
- Ad hoc testing
- Administrative data
- Administrative metadata
- Aggregated data
- Aggregation
- Algorithm
- Analogue data
- Analogue materials
- Analogue signals
- Analytical quality control
- Analytics
- Anomaly
- Anonymity
- Application Vulnerability Description Language
- Applied science
- Architecture
- Archive
- Archiving
- Archivist
- At-risk data
- Audit
- Authentication
- Authenticity metadata
- Behavioural competencies
- Best practice
- Big data
- Bit Sequence
- Bit Stream
- Black box
- Blueprint
- Born digital
- Boundary value
- Bug
- Canonical data collection
- Catalogue
- Cataloguing
- Causation
- Certified product
- Change log
- Change management
- Checksum
- Chief Data Officer
- Chief Digital Officer
- Chief Information Officer
- Chief Technology Officer
- Citable data
- Client
- Cloud computing
- Cloud ecosystem
- Collection management identification
- Comma separated values
- Commit
- Compliance
- Component
- Compute intensive
- Computer code
- Computer intensive
- Computer systems
- Concept
- Confidential information
- Confidentiality
- Conformance
- Consensus standard
- Consumer data
- Container
- Content information
- Content replication
- Controlled vocabulary
- Corpus
- Correlation
- Corrupt data
- Creativity
- Cross-disciplinary
- Curation
- Curation workflow
- Dark data
- Darwin information typing architecture
- Data
- Data access protocol
- Data acquisition
- Data archive
- Data availability
- Database
- Database administration
- Data capture
- Data catalogue
- Data centre
- Data citation
- Data cleaning
- Data collection
- Data completeness
- Data compliance
- Data container
- Data curation
- Data custodian
- Data de-noising
- Data destruction
- Data dictionary
- Data dredging
- Data driven decision management
- Data driven disaster
- Data element
- Data entity
- Data exploration
- Data file format
- Data governance
- Data harmonization
- Data hygiene
- Data identifier
- Data ingestion
- Data integration
- Data integrity
- Data item
- Data librarian
- Data lifecycle
- Data linkage
- Data management
- Data management infrastructure
- Data management plan
- Data management policy
- Data mart
- Data migration
- Data mining
- Data model
- Data modeling
- Data munging
- Data organization
- Data policy
- Data preprocessing
- Data processing
- Data production
- Data profiling
- Data publication
- Data quality
- Data recovery
- Data reduction
- Data reference model
- Data registration
- Data repository management
- Data representation
- Data rescue
- Data residency
- Data retention policy
- Data review
- Data sampling
- Data scaling
- Data selection
- Dataset series
- Data sharing
- Data splitting
- Data standardization
- Data steward
- Data store
- Data stream
- Data structure
- Data structure continuum
- Data table attribute
- Data tension
- Data traceability
- Data transformation
- Data type registry
- Data upload database
- Data validation
- Data warehouse
- Data wrangling
- Data z-score scaling
- Datetime
- De-anonymization
- Deep archive
- De facto standard
- Defect
- De-identification
- Demilitarized zone
- Denormalization
- Derived data product
- Descriptive metadata
- Digital
- Digital archiving
- Digital data
- Digital infrastructure
- Digital materials
- Digital object
- Digital Object Identifier
- Digital preservation
- Digital research data
- Digital scholarship
- Digital signals
- Digitisation
- Dirty data
- Disambiguation
- Documented data
- Document type definition
- Dublin Core
- Dynamic data
- Ecosystem
- Electronic health record
- Electronic medical record
- Encoding schema
- Engineering and scientific support
- Enhancement
- E-Research
- E-Research infrastructure
- Error
- Error seeding
- E-Science
- Evaluation
- Executive
- EXtensible Markup Language
- Extensible resource identifier
- Extract-Transform-Load
- Failure
- Fair use
- Feature extraction
- Field
- Firefighting
- Fixed data
- Foundational interoperability
- Framework
- Golden record
- Governance
- Governance and accountability model
- Gremlin
- Grid
- Hashing
- Health science
- Heat map
- High quality data
- Human-readable format
- Hypermedia As The Engine Of Application State
- Identity ecosystem
- Impact
- Import
- Incumbent-based
- Indeterminate employment
- Informaticist
- Information
- Information management advisor
- Information management specialist
- Information silos
- Information technology specialist
- Innovation
- Input
- Instrument
- Instrument output data
- Integrated access management
- Integration
- Integrity
- Intellectual leadership
- Inter-disciplinary
- Interface testing
- International chemical identifier
- International standard
- International Standards Organization
- Interoperability
- Investigation
- ISO 8000
- ISO 9000
- ISO 17025
- ISO 19115 Metadata profile
- Key stakeholder
- Knowledge
- Laboratory manager
- Laboratory supervisor
- Laboratory technician
- Laboratory technologist
- Landscape
- Legacy data
- Linked open data
- Long-term preservation
- Machine readable
- Machine-readable format
- Manage datasets in a repository
- Manage metadata catalog
- Manager
- Managing research
- Mandatory standard
- Mashup
- Masking
- Meaningful use
- Medium-term preservation
- Meets requirements
- Message privacy
- Metadata
- Metadata catalogue
- Metadata dataset
- Metadata record
- Middleware
- Migration
- Minimal metadata
- Missing Data
- Moof monster
- Murphy's Law
- Namespace
- National standard
- Negative testing
- Noisy data
- Non identifiable data
- Non personally identifiable information
- Normalization
- OAI repository
- Object attribute
- Object model
- Object property
- Open Archives Initiative Protocol for Metadata Harvesting
- Open data
- Open government
- Operational management
- Organizational leadership
- Original repository
- Peer review
- Persistent identifier
- Persistent uniform resource locator
- Personal information privacy
- Personally identifiable information
- Physical science
- PID attribute
- PID domain
- PID record
- PID resolution
- PID service
- PID system
- Pipe separated values
- Preprint
- Preservation
- Preservation metadata
- Pretty Good Privacy fingerprint
- Pretty Good Privacy ID
- Principal Investigator
- Privacy
- Privacy governance
- Privacy-preserving data linkage
- Process
- Productivity
- Professional standard
- Program
- Program governance
- Program manager
- Project
- Project lifecycle
- Project management lifecycle
- Project manager
- Project quality control
- Project team member
- Proportionate governance
- Proprietary
- Protocols
- Provenance
- Provenance metadata
- Quality assurance
- Quality control
- Raw data
- Real-time data
- Recognition
- Record
- Record provenance information
- Records retention schedule
- Record standardization
- Redundancy
- Referable data
- Reference model
- Reference resolution
- Reformatting
- Refreshing
- Registered data
- Registry
- Related scientific activities
- Relational database
- Relations
- Reliability
- Remote access
- Remote data access
- Repeatable process
- Replica number
- Replication
- Repository
- Repository access
- Representation
- Representation and client services
- Representation object
- Reproducible research
- Repurposed data
- Requirements
- Requirements analysis
- Requirements creep
- Requirements stability index
- Research
- Research and development
- Research context
- Research data
- Research data format
- Research data management
- Research data management infrastructure
- Research data publication workflow
- Research, development and analysis
- Researcher level
- Researcher promotion documentation
- Research governance
- Research manager
- Research metadata format
- Research results
- Research scientist
- Resistance management
- Resource
- Resource authorization
- Responsibility
- Result
- Retention period
- Re-use
- Revision control system
- Robustness
- Role
- Schema
- Science
- Science and technology data
- Scientific data infrastructure
- Scientific data services
- Scientific method
- Scientific workflow
- Scientist
- Semantic data
- Semantic interoperability
- Semi-structured data
- Service object
- Services
- Short-term preservation
- Silver bullet
- SMART
- Specialty
- Stakeholder
- Standard
- Standardization
- Standard Operating Procedure
- Standard Operating Procedure for the collection of harmonized or integrated data
- Statistical de-identification
- Steering committee
- Sticky bits
- Storage location
- Strategy
- Structural metadata
- Structured data
- Support service
- SWOT
- Syntactic interoperability
- System
- System metadata
- Table
- Tab Separated Values
- Technical metadata
- Technique
- Technology
- Temporary version
- Text file
- Tool
- Topical metadata
- Total Quality Management
- Transdisciplinary
- Trusted Digital Repository
- Unified data management platform
- Uniform resource identifier
- Uniform resource namespace
- Universally Unique Identifier
- Universal Numeric Fingerprint
- University teaching
- Unstructured data
- Usable data
- Use case
- Use metadata
- User acceptance testing
- Valued outcome
- Verify checksum
- Version control
- View
- Voluntary standard
- Web resource