Demystify AI terminology with our comprehensive glossary. Understand the technology powering intelligent document processing.
Start with these fundamental concepts
The simulation of human intelligence in machines programmed to think and learn like humans. AI systems can perform tasks that typically require human intelligence, such as visual perception, speech recognition, decision-making, and language translation.
A subset of AI that enables systems to learn and improve from experience without being explicitly programmed. ML algorithms build mathematical models based on training data to make predictions or decisions.
A branch of AI that helps computers understand, interpret, and manipulate human language. NLP bridges the gap between human communication and computer understanding.
A field of AI that trains computers to interpret and understand visual information from the world, including images and videos.
AI technologies specifically designed to understand, process, and extract information from documents in various formats.
The use of AI technologies to capture, extract, and process data from various document types with minimal human intervention.
Technology that converts different types of documents, such as scanned paper documents or PDF files, into editable and searchable data.
The process of retrieving specific data fields from documents, often using AI to understand context and meaning.
The combination of AI technologies with automation to create systems that can learn, adapt, and make decisions.
The percentage of correct predictions or extractions made by an AI model.
An ML approach in which the model selects the unlabeled examples it expects to learn the most from and requests labels for those first.
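One common selection strategy is uncertainty sampling: ask a human to label the documents the model is least sure about. The sketch below is only an illustration of that idea. The function name `select_for_labeling`, the document ids, and the probabilities are all hypothetical.

```python
def select_for_labeling(probabilities, k):
    """Return the ids of the k documents with the lowest top-class probability.

    probabilities: dict mapping a document id to a list of class probabilities.
    """
    # A low maximum probability means the model is unsure about that document.
    ranked = sorted(probabilities, key=lambda doc_id: max(probabilities[doc_id]))
    return ranked[:k]


# Hypothetical model outputs for three unlabeled documents.
pool = {
    "doc_a": [0.98, 0.02],
    "doc_b": [0.55, 0.45],
    "doc_c": [0.70, 0.30],
}

print(select_for_labeling(pool, 2))  # ['doc_b', 'doc_c']
```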
A step-by-step procedure or formula for solving a problem. In ML, algorithms are the methods used to train models from data.
Identifying patterns in data that do not conform to expected behavior.
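A simple statistical way to flag such values is the z-score: how many standard deviations a value lies from the mean. This is a minimal sketch of one technique among many, not a description of any particular product. The function name, the threshold of 2.0, and the sample amounts are assumptions.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values are identical, so nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical invoice amounts; one is far from the rest.
amounts = [100, 102, 98, 101, 99, 500]
print(find_anomalies(amounts))  # [500]
```

With very small samples a single extreme value also inflates the standard deviation, so production systems often prefer more robust statistics.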
A set of protocols and tools that allows different software applications to communicate and share data.
Processing large volumes of data in groups rather than individually.
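As a rough illustration, grouping items into fixed-size batches can be sketched with a generator. The name `in_batches` and the file names are hypothetical.

```python
def in_batches(items, batch_size):
    """Yield lists of up to batch_size items, in their original order."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller, batch


files = ["a.pdf", "b.pdf", "c.pdf", "d.pdf", "e.pdf"]
for group in in_batches(files, 2):
    print(group)
# ['a.pdf', 'b.pdf']
# ['c.pdf', 'd.pdf']
# ['e.pdf']
```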
A score indicating how certain the AI model is about its classification decision.
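Many classifiers derive this score by applying a softmax to their raw class scores and reporting the highest resulting probability. The sketch below assumes that approach. The function name, labels, and scores are hypothetical, and a given model may compute confidence differently.

```python
import math


def classify_with_confidence(scores):
    """Turn raw class scores into (best_label, confidence) using softmax."""
    top = max(scores.values())
    # Subtracting the maximum keeps exp() numerically stable.
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    probs = {label: e / total for label, e in exps.items()}
    best = max(probs, key=probs.get)
    return best, probs[best]


# Hypothetical raw scores for one document.
label, confidence = classify_with_confidence(
    {"invoice": 2.0, "receipt": 1.0, "contract": 0.1}
)
print(label, round(confidence, 2))  # invoice 0.66
```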
Automation that uses AI to handle tasks requiring judgment, perception, and decision-making.
Modern OCR that uses computer vision techniques for better accuracy and understanding.
A class of deep neural networks most commonly applied to analyzing visual imagery.
An ML technique based on artificial neural networks with multiple layers. Deep learning models can automatically learn hierarchical representations of data, making them particularly effective for complex tasks.
The process of automatically categorizing documents into predefined classes based on their content and structure.
The ability of AI systems to comprehend document structure, layout, and meaning beyond simple text extraction.
Processing data near the source of data generation rather than in a centralized cloud.
Dense vector representations of text that capture semantic meaning. Words or documents with similar meanings have similar embeddings.
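Similarity between embeddings is commonly measured with cosine similarity. The sketch below uses made-up three-dimensional vectors purely for illustration. Real embeddings come from a trained model and usually have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, invented for this example.
embeddings = {
    "invoice": [0.9, 0.1, 0.3],
    "bill": [0.8, 0.2, 0.4],
    "giraffe": [0.1, 0.9, 0.0],
}

related = cosine_similarity(embeddings["invoice"], embeddings["bill"])
unrelated = cosine_similarity(embeddings["invoice"], embeddings["giraffe"])
print(related > unrelated)  # True
```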
The process of identifying and extracting specific entities like names, dates, amounts, and addresses from text.
The process of managing documents that cannot be processed fully automatically and require human intervention.
The harmonic mean of precision and recall, providing a single score for model performance.
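To make the harmonic mean concrete, the sketch below computes precision, recall, and F1 from counts of true positives, false positives, and false negatives. The function name and the sample counts are illustrative assumptions.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, f1) from raw prediction counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 8 correct extractions, 2 spurious ones, 8 missed.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
print(precision, recall, round(f1, 3))  # 0.8 0.5 0.615
```

Because the harmonic mean is pulled toward the smaller of the two values, a model cannot reach a high F1 by excelling at only one of precision or recall.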
Adjusting a pre-trained model to work better on specific data or tasks.
An AI capability for identifying and extracting data from structured forms, including both filled and blank form templates.
Organizing documents into a tree-like category structure with parent-child relationships.
AI systems that incorporate human feedback to improve performance and handle exceptions.
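A typical way to bring people into the loop is to route low-confidence results to a review queue and let the rest pass through automatically. The sketch below assumes each extraction carries a confidence score. The field names, the 0.9 threshold, and the sample records are hypothetical.

```python
def route(extractions, threshold=0.9):
    """Split extractions into (auto_accepted, needs_review) by confidence."""
    auto_accepted, needs_review = [], []
    for item in extractions:
        if item["confidence"] >= threshold:
            auto_accepted.append(item)
        else:
            needs_review.append(item)
    return auto_accepted, needs_review


# Hypothetical extraction results for one invoice.
results = [
    {"field": "total", "value": "1250.00", "confidence": 0.97},
    {"field": "due_date", "value": "2024-03-15", "confidence": 0.62},
]

accepted, review = route(results)
print([r["field"] for r in accepted])  # ['total']
print([r["field"] for r in review])    # ['due_date']
```

Corrections made by reviewers can then be collected as new labeled examples for retraining, which is how the feedback improves performance over time.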
Techniques applied to document images before OCR to improve recognition accuracy.
The ability of AI to identify objects, places, people, writing, and actions in images.
The process of partitioning an image into multiple segments or regions to simplify analysis.
The process of using a trained model to make predictions on new data. Also called prediction or scoring.
An advanced form of OCR that can recognize handwritten text and convert it into a machine-readable format.
Identifying and extracting pairs of labels and their corresponding values from documents.
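For text where labels and values are separated by a colon, a rule-based baseline can be sketched with a regular expression. This is a deliberately simple stand-in, not the AI-based approach the definition refers to. The pattern, function name, and sample text are assumptions.

```python
import re

# Label, then a colon, then the value; surrounding whitespace is trimmed.
PAIR = re.compile(r"^\s*([^:]+?)\s*:\s*(.+?)\s*$")


def extract_pairs(text):
    """Return a dict of label -> value for every 'Label: value' line."""
    pairs = {}
    for line in text.splitlines():
        match = PAIR.match(line)
        if match:
            pairs[match.group(1)] = match.group(2)
    return pairs


# Hypothetical OCR output from an invoice header.
sample = """Invoice Number: INV-1042
Date: 2024-03-15
Thank you for your business
Total Due: $1,250.00"""

print(extract_pairs(sample))
# {'Invoice Number': 'INV-1042', 'Date': '2024-03-15', 'Total Due': '$1,250.00'}
```

Rules like this break when the label and value sit in different columns or on different lines, which is where layout-aware models become useful.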
The process of understanding the physical structure and organization of elements within a document.
An architectural style where applications are built as a collection of small, independent services.
A mathematical representation learned from data that can make predictions or decisions. Models are the output of machine learning algorithms.
Classification in which each document is assigned to exactly one of several predefined categories.
An NLP technique that identifies and classifies named entities (people, places, organizations, dates, etc.) in text.
Computing systems inspired by the biological neural networks in animal brains. They consist of interconnected nodes (neurons) that pass weighted signals to one another and learn by adjusting those weights.
A computer vision technique that identifies and locates objects within images or videos.
A metric indicating how certain the OCR system is about its character recognition results.
The percentage of positive predictions that were actually correct.
Processing data immediately as it arrives, with minimal latency.
The percentage of actual positive cases that were correctly identified.
Technology that uses software robots to automate repetitive, rule-based tasks typically performed by humans.
The ability of a system to handle increased load by adding resources.
Search that understands the intent and contextual meaning of search queries.
The process of determining the emotional tone or opinion expressed in text. Used to understand attitudes, opinions, and emotions.
The ability to process documents from input to completion without human intervention.
An ML approach where models are trained on labeled data. The algorithm learns from input-output pairs and can make predictions on new, unseen data.
The process of identifying tables in documents and extracting their structured data while preserving relationships.
The process of assigning predefined categories to text documents based on their content.
The process of locating regions in an image that contain text before performing recognition.
The process of breaking text into smaller units (tokens), such as words, phrases, or sentences, for analysis.
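A minimal word-level tokenizer can be sketched with a regular expression that keeps runs of word characters together and treats each punctuation mark as its own token. Modern language models typically use learned subword tokenizers instead, so the pattern and function name here are illustrative only.

```python
import re


def tokenize(text):
    """Split text into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


print(tokenize("Invoice #123 is due."))
# ['Invoice', '#', '123', 'is', 'due', '.']
```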
The dataset used to train machine learning models. The quality and quantity of training data directly affect model performance.
Using a pre-trained model on a new but related task, leveraging previously learned knowledge.
A neural network architecture that uses self-attention mechanisms. It forms the basis of modern NLP models such as GPT and BERT.
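The core of self-attention is scaled dot-product attention: each token's output is a weighted average of all value vectors, with weights derived from how well its query matches each key. The sketch below shows only that step in plain Python. Real Transformers also apply learned projection matrices, multiple heads, and other layers, all omitted here, and the function names and sample vectors are assumptions.

```python
import math


def softmax(xs):
    """Convert a list of scores into weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors, one dimension at a time.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values))
             for j in range(len(values[0]))]
        )
    return outputs


# Two identical keys receive equal weight, so the output is the mean value.
out = self_attention([[1.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]], [[0.0, 2.0], [4.0, 6.0]])
print(out)  # [[2.0, 4.0]]
```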
An ML technique in which models find patterns in data without labeled examples. The algorithm discovers hidden structures in unlabeled data.