How Machine Learning works – and what it means for your organization

By altilia on May 8, 2023

In our second blog of this series, where we unlock the lexicon of Artificial Intelligence for business leaders currently being overwhelmed by the hype of ChatGPT, we will focus on Machine Learning (ML).

What is Machine Learning?

People often use the terms machine learning and AI interchangeably, but they don’t mean the same thing. ML is a subset of AI in which computers learn, or improve their performance, from the data they are given.

It’s a fascinating concept, straight out of science fiction: a computer uses algorithms to learn from the data provided. The more it develops, the more it learns: the more data it is fed, the better it gets.

This is where concerns arise that computers could become “more intelligent” than their human masters.

ML has become more successful and prominent in the past decade because of the growth in the volume, variety and quality of both public and privately owned data, and the availability of cheaper and more powerful data processing and storage.

Essentially, ML models look for patterns in data and draw conclusions, which are then applied to new sets of data. People do not program them with explicit rules for each case; their capabilities develop from the data provided, particularly with large data sets. As a rule, the more good-quality data is used, the better the results will be.
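To make "find a pattern, then apply it to new data" concrete, here is a minimal sketch in Python. It fits a straight line to a handful of example points and then uses that line on a value it has never seen. The data (pages in a document versus minutes to review it) and the function names are invented purely for illustration; real ML models are far more sophisticated, but the principle is the same.

```python
# Toy illustration: learn a pattern from example data, then apply it to new data.
# All numbers are invented for illustration only.

def fit_line(xs, ys):
    """Learn slope and intercept of y = slope * x + intercept (least squares)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Apply the learned pattern to a new input."""
    slope, intercept = model
    return slope * x + intercept

# Hypothetical training data: pages in a document vs. minutes needed to review it.
pages = [1, 2, 3, 4, 5]
minutes = [3, 5, 7, 9, 11]

model = fit_line(pages, minutes)   # the "learning" step
estimate = predict(model, 10)      # applying the pattern to unseen data
print(f"Estimated review time for 10 pages: {estimate:.1f} minutes")
```

Nobody told the program that each page adds two minutes; it worked that out from the examples.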

So, where AI is the umbrella concept of enabling a machine to sense, reason or act like a human, ML is an AI application that allows computers to extract knowledge from data and learn from it autonomously.

How to train ML models

The key to machine learning (as with much else in life) is training. ML models need to be trained on data, using learning algorithms, before they can produce useful results.

Four training approaches are used in machine learning:

  • Supervised learning maps a specific input to an output using labelled training data. Put simply, to train an algorithm to recognize pictures of cats, you feed it labelled pictures of cats.
  • Unsupervised learning is based on unlabelled data, so the end result is not known in advance. This is good for pattern discovery and descriptive modelling. For example, Altilia uses Large Language Models (LLMs) as its foundation, which are trained on huge datasets using unsupervised (more precisely, self-supervised) learning.
  • Reinforcement learning can be described as “learning by doing”. An “agent” learns to perform a task through a trial-and-error feedback loop, receiving positive or negative reinforcement depending on its success, until it performs within the desired range. Altilia often uses Human-in-the-Loop (HITL) reinforcement learning in its Altilia Review module.
  • Transfer learning enables data scientists to benefit from knowledge gained by a previous model on a similar task, in the same way that humans can transfer their knowledge of one topic to a similar one. It can shorten ML training time and require fewer data points. Altilia uses this technique to fine-tune pre-trained Large Language Models (LLMs) on a dataset provided by the client. We will focus on LLMs in a future blog.
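The difference between the first two approaches in the list above is easiest to see side by side. The sketch below is a toy example in Python, not a description of how Altilia's platform works: it assumes a single made-up feature (an animal's weight in kilograms) and uses two very simple techniques, a nearest-centroid classifier for the supervised case and a two-group k-means clustering for the unsupervised case. Reinforcement and transfer learning are not covered here.

```python
# Toy contrast between supervised and unsupervised learning on 1-D data.
# The feature (weight in kg) and all values are invented for illustration.

def train_supervised(examples):
    """Supervised: examples are (value, label) pairs. Learn one average per label."""
    sums, counts = {}, {}
    for value, label in examples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def classify(centroids, value):
    """Predict the label whose learned average is closest to the new value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

def cluster_unsupervised(values, iterations=10):
    """Unsupervised: no labels given. Split the values into two groups (2-means)."""
    low, high = min(values), max(values)  # initial guesses for the group centres
    if low == high:                       # all values identical: only one group
        return sorted(values), []
    for _ in range(iterations):
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return sorted(group_low), sorted(group_high)

# Supervised: a person has labelled every training example.
labelled = [(3.5, "cat"), (4.2, "cat"), (5.0, "cat"),
            (25.0, "dog"), (30.0, "dog"), (35.0, "dog")]
centroids = train_supervised(labelled)
prediction = classify(centroids, 6.0)

# Unsupervised: the same numbers without labels; the groups are discovered.
unlabelled = [value for value, _ in labelled]
clusters = cluster_unsupervised(unlabelled)

print(prediction)
print(clusters)
```

In the supervised case the model can name its answer ("cat") because it was given names to learn from. In the unsupervised case it finds the same two groups but has no idea what to call them.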

Why not schedule a demo with Altilia to learn more about how we can help transform your organization? Click here to register. 

Explore more stories like this one

Leveraging GPT and Large Language Models to enhance Intelligent Document Processing

The rise of Artificial Intelligence has been the talk of the business world since the emergence of ChatGPT earlier this year. Executives around the world now need to understand the importance and power of Large Language Models (LLMs) in delivering potentially ground-breaking use cases that can bring greater efficiency and accuracy to mundane tasks.

Natural Language Generation (NLG) enables computers to write a human-language text response based on human-generated prompts. What few understand is that there is still a deep flaw in the ChatGPT technology: as many as 20-30% of all results contain inaccuracies, according to Gartner. Gartner found that ChatGPT is “susceptible to hallucinations and sometimes provides incorrect answers to prompts. It also reflects the deficiencies of its training corpus, which can lead to biased or inappropriate responses as well as algorithmic bias.”

To better understand this, it’s key to consider how LLMs work: hundreds of billions of pieces of training data are fed into the model, enabling it to learn patterns, associations, and linguistic structures. This massive amount of data allows the model to capture a wide range of language patterns and generate responses based on its learned knowledge. However vast the training data, the model can only generate responses as reliable as the information it has been exposed to. If it encounters a question or topic that falls outside the training data or knowledge cutoff, responses may be incomplete or inaccurate.

For this reason, and to better understand how best to use LLMs in enterprise environments, Gartner outlined a set of AI Design Patterns and ranked them by the difficulty of each implementation. We are delighted to share that Altilia Intelligent Automation already implements two of the most complex design patterns in its platform:

  • LLM with Document Retrieval or Search: this links LLMs with internal document databases, unlocking key insights from internal data with LLM capabilities. Because answers draw on retrieved documents, the information is much more accurate and relevant, and the potential for inaccuracies is reduced.
  • Fine-tuning LLM: the LLM foundation model is fine-tuned using transfer learning with an enterprise’s own documents or a particular training dataset, which updates the underlying LLM parameters. LLMs can then be customized to specific use cases, providing bespoke results and improved accuracy.

So, while the business and technology world has been getting excited by the emergence of ChatGPT and LLMs, Altilia has already been providing tools to enterprises to leverage these generative AI models to their full potential. By doing so, thanks to its model’s fine-tuning capabilities, we are able to overcome the main limitation of a system like OpenAI’s ChatGPT, which is the lack of accuracy of its answers.

For more information on how Altilia Intelligent Automation can help your organization, schedule a free demo here.
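The “LLM with Document Retrieval or Search” pattern described above can be illustrated with a small sketch in Python. This is not Altilia's implementation. It assumes a very simple way of finding the relevant document (counting shared words and comparing with cosine similarity), and it stops at building the prompt; the call to an actual LLM is left out. The sample documents and function names are hypothetical.

```python
# Toy sketch of retrieval before generation: find the most relevant internal
# document first, then hand it to the language model as context.
# The documents and the word-count similarity are illustrative assumptions.
import math
import re
from collections import Counter

def tokenize(text):
    """Lower-case the text and split it into words and numbers."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    shared = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, documents, top_k=1):
    """Return the top_k documents most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(documents,
                    key=lambda d: cosine(q, Counter(tokenize(d))),
                    reverse=True)
    return ranked[:top_k]

def build_prompt(question, documents):
    """Combine the retrieved context and the question into one prompt."""
    context = "\n".join(retrieve(question, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Hypothetical internal documents.
documents = [
    "The supplier contract renews automatically on 1 January each year.",
    "Invoices must be paid within 30 days of receipt.",
    "Employees accrue two days of leave per month.",
]

prompt = build_prompt("When must invoices be paid?", documents)
print(prompt)  # this prompt would then be sent to an LLM (not shown here)
```

Because the model is asked to answer from the retrieved text instead of from memory alone, it has less room to invent an answer.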

How to use AI to discover the hidden meaning in complex documents

Welcome to our third blog of a series uncovering the key components of Artificial Intelligence to provide greater understanding for business leaders who may currently have FOMO (Fear Of Missing Out) from the blizzard of acronyms and hype.

Here, we look at Computer Vision (CV), one of the main applications of AI, in which computers gain a high-level understanding of digital images or videos. Critically, Computer Vision is concerned with the automatic extraction of data, enabling documents that have handwriting and random layouts to become machine-readable.

Huge data volumes

Computer Vision needs a lot of data to be able to distinguish and recognize images. In a way, it works like a jigsaw puzzle, where you assemble all the scattered pieces to make an image. Neural networks for CV work on the same principle. The computer does not have the final image, however; instead it is fed large numbers of related images that train it to recognize specific objects. To identify a cat, the computer would not be shown individual elements such as ears, whiskers, tail etc, but millions of pictures of cats so that it can model the features of our feline friends.

CV is used for visual surveillance, medical image processing for patient diagnosis and navigation by autonomous vehicles. But in Altilia’s development of Intelligent Document Processing (IDP), CV has several key roles to play. With Optical Character Recognition (OCR) and Intelligent Character Recognition (ICR), we are able to convert scanned documents into machine-readable PDFs, and with Handwritten Text Recognition (HTR) we can incorporate items such as signatures.

End goal

The end goal of an IDP solution is to extract meaningful information that is “hidden” in unstructured texts and documents, so we first need to break words down in a way that a machine can understand. This is especially relevant when the documents that need to be processed are (low quality) scans such as contracts, forms, invoices or ID cards.

We then need to apply OCR to recognize both printed and handwritten text, using smaller units called tokens. Metadata is added to each token, which is useful later in a search engine. In IDP, it is useful to distinguish a photo from text and to tag elements such as signatures, stamps and markings, saving human labor time by automating checks such as whether a contract is signed and marked.

Finally, we focus on document layout analysis so that unsorted documents can be classified; we can then apply different machine learning algorithms and branch out into different ML pipelines.

These core capabilities allow Altilia’s solution to work as a general purpose platform, rather than a point solution for specific document types and formats. We have also developed a patented solution for document layout analysis.

For more information on how Altilia Intelligent Automation can help your organization, schedule a free demo here.
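The idea of tokens carrying metadata, described above, can be sketched in a few lines of Python. The field names below (page number, position on the page, and the kind of element) are assumptions chosen for illustration and are not Altilia's actual data model. The sketch shows how such metadata makes an automated check like "is this contract signed?" straightforward.

```python
# Toy sketch of OCR output as tokens with metadata.
# The Token fields and sample values are hypothetical, for illustration only.
from dataclasses import dataclass

@dataclass
class Token:
    text: str    # the recognized text
    page: int    # page the token was found on
    bbox: tuple  # position on the page as (x0, y0, x1, y1)
    kind: str    # e.g. "printed", "handwritten", "signature", "stamp"

def is_signed(tokens):
    """Automated check: does the document contain at least one signature?"""
    return any(token.kind == "signature" for token in tokens)

def tokens_on_page(tokens, page):
    """Use the metadata to look up everything found on a given page."""
    return [token for token in tokens if token.page == page]

# Hypothetical output for a scanned three-page contract.
contract = [
    Token("Agreement", 1, (72, 90, 180, 110), "printed"),
    Token("Approved", 3, (80, 650, 160, 690), "stamp"),
    Token("J. Smith", 3, (300, 700, 420, 740), "signature"),
]

signed = is_signed(contract)
last_page = [token.text for token in tokens_on_page(contract, 3)]
print(signed, last_page)
```

A person no longer has to open the document to see whether the final page was signed and stamped; the tags on the tokens answer the question.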

The lexicon of Artificial Intelligence - and how it can transform your organization

The emergence of ChatGPT in recent months has prompted an unprecedented explosion of public hype and interest in Artificial Intelligence and its potential future uses. Following industry luminaries such as Bill Gates describing AI as the most important technological advance in decades, there has been a torrent of predictions on how it will change the world of work.

Here at Altilia we have been building an AI-based platform, Altilia Intelligent Automation, that is at the forefront of a new revolution based around Intelligent Document Processing (IDP). It is clear, however, that industry leaders anxious to understand this new phenomenon are overwhelmed by the language, acronyms and blizzard of nascent technologies, which leave them baffled and unsure how to get started. Our aim over a series of blogs is to unpack the lexicon of AI and explain what it all means: a useful, insightful and simplified guide for organizations interested in how AI will affect them.

What is Artificial Intelligence?

The term artificial intelligence (AI) is used loosely to refer to applications that perform complex tasks that previously required human intervention, such as communicating with customers online or playing chess. The term is often used interchangeably with the terms machine learning (ML) and deep learning, although they (as we will discover) have different meanings.

Many companies are investing significantly in data science teams to take full advantage of AI, combining statistics, computer science and business knowledge to extract value from various data sources and enable problem-solving. Algorithms seek to create expert systems which make predictions or classifications based on input data. Advanced functions can include the ability to see, understand and translate spoken and written language, analyze data, make recommendations and classify both structured and unstructured data.

Computers and machines are developed with the ability to reason, learn and act in a way that would normally require human intelligence, often at a scale or speed that exceeds what humans can deal with.

Benefits of AI

  • Automation. AI can automate workflows and processes or work independently and autonomously from a human team.
  • Reduce human error. AI can eliminate manual errors in data processing, analytics, assembly in manufacturing, and other tasks through automation and algorithms that follow the same processes every single time.
  • Eliminate repetitive tasks. AI can be used to perform repetitive tasks, freeing human capital to work on higher impact problems.
  • Fast and accurate. AI can process more information more quickly than a human, finding patterns and discovering relationships in data that a human may miss.
  • Accelerated research and development. The ability to analyze vast amounts of data quickly can lead to accelerated breakthroughs in research and development.

Altilia are world experts in Intelligent Document Processing (IDP), so let’s have a look at how it builds on AI capabilities and developments to enhance organizational efficiency and accuracy.

What is Intelligent Document Processing?

At its simplest, Intelligent Document Processing (IDP) converts unstructured and semi-structured data into structured, usable information, adding layers of automation to document-centric business processes. As an example, many mortgage forms may be filled in with (unstructured) hand-written answers by an applicant, and a person would normally have to input that information into a financial services company’s systems. IDP uses AI (and other technologies) to extract that information in a usable form, thus reducing time and manual labor.

Which AI fields are relevant for IDP?

A human reading and understanding a document needs to:

  • Have visual perception to recognize images, symbols and writing
  • Have a comprehensive understanding of the document’s language
  • Be able to understand new information and learn concepts
  • Be able to memorize concepts and the relations between them.

AI, therefore, needs to emulate these abilities to be capable of reading like a human. For IDP, these are the most relevant application fields:

  • Computer Vision is the AI field that emulates human visual perception
  • Natural Language Processing (NLP) focuses on developing algorithms for general understanding of a language
  • Machine Learning (ML) is focused on training a machine to learn and conceptualise information as models
  • Knowledge Representation is focused on representing information in a way that is functional, capturing not just the concepts themselves but the relations between them.

We will continue this series explaining the lexicon of AI in the coming weeks. For more information on how Altilia can support your business, schedule a demo here.
