Text | Wang Jiwei
From "human + RPA" to "human + generative AI + RPA": how does the LLM affect RPA human-computer interaction?
Seen from the perspective of human-computer interaction, how does the LLM affect RPA?
RPA has changed human-computer interaction in program development and process automation. Will it now be changed by the LLM in turn?
How does the LLM affect human-computer interaction, and how does generative AI change human-computer interaction in RPA? This article covers both.
If you ask what RPA has contributed to program development and automation, one answer is that it has changed human-computer interaction (HCI).
With traditional workflow automation tools, software developers have to write out a list of actions and then automate tasks and interface with back-end systems through internal application programming interfaces (APIs) or specialized scripting languages.
RPA systems build the action list by observing a user perform the task in the application's graphical user interface (GUI), then automate it by repeating those actions directly in the GUI. They can also process data across multiple applications.
This seemingly simple "plug-in" style of automation effectively lowers the barrier to using automation in products and makes end-to-end automation possible for more organizations.
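The record-then-replay idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `FakeApp`, `Recorder`, `Action` and the control names are hypothetical stand-ins, not the API of any real RPA product, and a real tool would observe an actual GUI rather than a wrapped object.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """One recorded GUI step: what was done, to which control, with what value."""
    kind: str          # "click" or "type"
    target: str        # hypothetical control name, e.g. "invoice_no"
    value: str = ""


@dataclass
class FakeApp:
    """Stand-in for an application's GUI (hypothetical, for illustration only)."""
    fields: dict = field(default_factory=dict)
    clicked: list = field(default_factory=list)

    def click(self, target: str) -> None:
        self.clicked.append(target)

    def type_text(self, target: str, value: str) -> None:
        self.fields[target] = value


class Recorder:
    """Wraps an app and logs every user operation as an Action."""

    def __init__(self, app: FakeApp) -> None:
        self.app = app
        self.actions: list[Action] = []

    def click(self, target: str) -> None:
        self.actions.append(Action("click", target))
        self.app.click(target)

    def type_text(self, target: str, value: str) -> None:
        self.actions.append(Action("type", target, value))
        self.app.type_text(target, value)


def replay(actions: list[Action], app: FakeApp) -> None:
    """Repeat the recorded steps against the GUI; no back-end API is involved."""
    for action in actions:
        if action.kind == "click":
            app.click(action.target)
        elif action.kind == "type":
            app.type_text(action.target, action.value)
        else:
            raise ValueError(f"unknown action kind: {action.kind}")


# A user performs the task once while the recorder watches...
recorder = Recorder(FakeApp())
recorder.type_text("invoice_no", "INV-001")
recorder.click("submit")

# ...and the robot later repeats it on a fresh application instance.
fresh_app = FakeApp()
replay(recorder.actions, fresh_app)
```

The point of the sketch is that the action list comes from watching the user rather than from a developer writing it by hand, which is where the lower barrier to entry comes from.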
As a business process automation technology that changes the way digital workers operate, RPA has spent more than 20 years freeing people from simple, repetitive work, and it has also made program development easier. At the same time, it has created a "human + RPA" model of human-computer interaction that lets organizations achieve human-machine collaboration more easily.
Especially after the emergence of sufficiently mature, flexible, scalable and reliable RPA platforms in recent years, many large organizations can use RPA to improve and optimize their business processes and development models to achieve efficiency gains and cost reductions.
All of this stems from RPA's continuous improvement of human-computer interaction in business process automation and program development.
Supported by many technologies, RPA keeps penetrating more industries and continues to change human-computer interaction in business scenarios across different fields.
In recent years RPA has become popular again precisely because it deeply integrates AI technology. Hyperautomation, the collection of automation-related technologies with RPA at its core, keeps enhancing the end-to-end automated human-computer interaction experience and is therefore favored by more organizations.
Now the era of large AI models has arrived, and the evolving RPA is integrating generative AI as well. RPA that integrates a large language model (LLM) represents huge progress in human-computer interaction, and even a subversion of the previous RPA model.
Since we want to talk about the impact of LLM on RPA human-computer interaction, we naturally have to start with human-computer interaction. What impact does LLM have on human-computer interaction? How does RPA improve human-computer interaction? What impact does LLM have on RPA?
In this article, Wang Jiwei Channel will talk to you about these.
Let’s talk about human-computer interaction
In the 1970s, most offices still ran their business on metal filing cabinets, typewriters and large amounts of paper. Bulky computers were kept in cold rooms and could be operated by only a few people.
To solve these problems, some companies began to develop personal computers. Xerox developed the Xerox Alto in 1973. Although it was never launched as a commercial product because of high cost and other issues, it became the first sketch of a GUI and a source of inspiration for the Macintosh and Windows.
Driven by a series of studies and the R&D that followed, as well as by the strong market demand for small computers at the time, the concept of human-computer interaction emerged as a new discipline in the late 1970s and early 1980s, as a way to study how and why to make computers more user-friendly.
Since then, the field of HCI has continued to develop. It dissects human behavior to help solve society's most complex problems, and it studies how people interact with computers and how well they are able to do so, with the goal of making interactions between computers and users successful and of identifying the areas that need further development.
Due to its ability to solve the acute contradictions in social productivity at that time, the research field of HCI expanded to all IT fields in a short period of time.
At the same time, researchers realized that interaction with computers had to be extended to everyone, not just information technology professionals. As a result, within a few years HCI expanded to cover nearly every aspect of information technology design.
Thanks to the efforts of Steve Jobs and others, Apple launched the Macintosh personal computer in 1984, which completely changed the form of human-computer interaction. It made computer use easier, communication simpler, and keyboard, mouse, and icon-based user interfaces became popular.
Later, Apple became a pioneer of the personal computer, and Microsoft launched the Windows system. These products completely changed global business processes and the form of human-computer interaction in the office.
Everyone is familiar with these, so there is no need to introduce them here.
To this day, IoT has become the basis of network connectivity, artificial intelligence has become ubiquitous, and human-computer interaction is still the focus of various technologies, products and solutions.
Through the previous brief history of development, I believe everyone should already have a general understanding of human-computer interaction. So what exactly is human-computer interaction? Let’s look at the next section.
Four elements, six goals and seven principles of human-computer interaction
By the general definition, human-computer interaction techniques are technologies that enable an effective dialogue between humans and computers through computer input and output devices.
On one side, the machine presents large amounts of relevant information, prompts and requests for instructions through output or display devices; on the other, people use input devices to enter information, answer questions and respond to prompts. Human-computer interaction technology is therefore an important part of computer user interface design.
Academically, human-computer interaction is a discipline concerned with the design, evaluation, and implementation of interactive computing systems for human use, and the study of the major phenomena surrounding them.
Human-computer interaction focuses on the interface (interaction interface) between people (users) and computers, and on the design and use of computer technology. Human-computer interaction covers many disciplines, including computer science, psychology, sociology, graphic design, industrial design, etc. It is a very comprehensive modern science.
According to Wikipedia, the interface between humans and computers is critical to facilitating this interaction. Desktop applications, Internet browsers, handheld computers and more make use of today's prevalent GUIs, while speech recognition and synthesis systems use voice user interfaces (VUIs).
Emerging multimodal and graphical user interfaces allow people to engage with embodied character agents in ways that other interfaces cannot.
The development of the field has improved the quality of interaction and opened up many new research areas. Rather than designing conventional interfaces, different branches of research focus on multimodality instead of unimodality, intelligent adaptive interfaces instead of command/action-based ones, and active interfaces instead of passive ones.
From the name of human-computer interaction, we can deduce that it consists of three parts, namely the user, the computer itself and the way they work together.
Later, these three parts were expanded into four basic elements, namely users, tasks, tools/interfaces and background.
At the same time, HCI has six goals: effective use (effectiveness), efficient use (efficiency), safe use (safety), good utility (utility), easy to learn (learnability) and easy to remember how to use (memorability).
On this basis, the seven design principles of HCI are also derived, as follows:
Principle 1: Equitable use;
Principle 2: Flexibility in use;
Principle 3: Simple and intuitive use;
Principle 4: Perceptible information;
Principle 5: Tolerance for error;
Principle 6: Low physical effort;
Principle 7: Size and space for approach and use.
In specific applications, the Internet of Things, eye tracking technology, speech recognition technology, the use of AR/VR and cloud computing are all very typical cases of human-computer interaction.
The development history of HCI and a large number of opinions and cases prove that technology can significantly improve HCI.
Breakthroughs in communication and information technology continue to bring significant improvements to HCI. RPA, for example, which has flourished in recent years with the help of AI, has greatly improved human-computer interaction and user experience in business process automation and office scenarios.
Human-computer interaction and RPA
We mentioned earlier that the goal of human-computer interaction is to enable computers to better adapt to human needs and provide friendlier, smarter, and more natural interaction methods, such as speech recognition, image recognition, natural language processing, and gesture control.
RPA is a technology that uses software robots to simulate human operations. It can interact with enterprise application systems through user interfaces and complete expected tasks.
Contemporary RPA also incorporates artificial intelligence (AI) and machine learning (ML) to achieve intelligent process automation (IPA) and handle more complex use cases such as natural language processing (NLP), computer vision (CV) and data analysis.
RPA can automate repetitive, rule-based workflows, improving efficiency, accuracy and compliance while reducing labor costs, error rates and time spent. It suits all kinds of repetitive, standardized business scenarios, such as finance, human resources, supply chain and information technology.
As Wang Jiwei Channel said in the article "In the era of digital transformation, RPA+AI is the best way to open up human-machine collaboration", among contemporary enterprise management software systems and automation tools, RPA can be regarded as the best way for organizations to apply human-machine collaboration in terms of operational difficulty, deployment cycle and investment cost.
Its biggest advantage is that it lowers the difficulty of program development, lets front-line business staff take part in building simple applications, and brings citizen development a step closer to reality.
The reason why RPA can do this is that it changes the human-computer interaction model of program development. This allows ordinary employees who cannot program to use RPA tools to develop the automation programs or software robots they need like programmers.
On the one hand, RPA makes developing programs easier, replacing code writing with "drag-and-drop" functional components; on the other hand, it automates more business processes, so that people no longer have to repeat them by hand. RPA can thus be said to have changed the human-computer interaction of program development and of business execution at the same time.
So, RPA is closely related to human-computer interaction. Because RPA is essentially a human-machine collaborative working model, it requires humans to define rules, supervise execution, optimize and improve, while machines are responsible for executing rules, providing feedback, and learning and improving.
RPA can not only simulate human operations but also combine with AI technology to achieve human-like understanding and decision-making. For example, OCR (optical character recognition) is used to identify text in images, NLP is used to understand the intention behind language, and intelligent decision-making technology is used to formulate optimal solutions.
RPA that integrates AI and other technologies has the following advantages:
1. Effectively reduce work burden, freeing people from tedious background tasks and focusing on more valuable innovation and strategic work;
2. Improve the speed and quality of human-computer interaction. Software robots can work around the clock, unaffected by time, location or emotion, and do not make the mistakes or omissions that come from fatigue or inattention;
3. Expand the scope and depth of human-computer interaction. Software robots can access and integrate multiple unrelated software systems, process large amounts of structured and unstructured data, and use the capabilities of AI and ML for learning and optimization.
Thus, RPA is an effective and typical technology for optimizing human-computer interaction. It can realize process automation, intelligence and optimization, bringing improvements in efficiency, quality and value to enterprises.
The impact of LLM on human-computer interaction
LLM is a language model that uses neural networks to perform self-supervised learning or semi-supervised learning on a large amount of unlabeled text. LLM has a huge number of parameters (usually billions or more) and can show excellent performance on a variety of tasks.
Judging from current applications in various fields, the emergence of generative AI technology based on LLM has brought disruptive changes to human-computer interaction.
The most direct impression generative AI gives people is that many software operations and cross-software operations in the original workflow can now be completed with just a few rounds of dialogue.
For example, when you use Midjourney to generate pictures or ChatGPT Plus to generate application code, you do not need drawing software or programming tools at all. Moreover, ChatGPT's plug-in ecosystem is improving rapidly, and in more and more application scenarios business operations will be completed through dialogue alone.
This is a change in the mode of interaction: interaction with the UIs of many different programs becomes interaction with a single chat window, which is an unprecedented interactive experience.
To sum up, LLM or generative AI has the following impacts on human-computer interaction:
First, it improves the efficiency, quality and convenience of human-computer interaction. Through generative AI, users can quickly obtain the information or services they want without spending a lot of time and energy. At the same time, an LLM can generate appropriate responses based on user input and context, reducing the user's input burden and making interaction more fluent and natural. In addition, generative AI can dynamically adjust its output based on user feedback and preferences to achieve better interactive effects.
For example, ChatGPT can help users complete complex tasks such as writing, design, and programming, or provide users with personalized recommendations, consultation, entertainment, and other content.
Second, it increases the diversity and creativity of human-computer interaction. An LLM can generate text, audio, video and other content in different styles according to user needs and preferences, meeting users' personalized and diverse demands. Through generative AI, users can access and choose from more content, broadening their horizons and thinking. Generative AI can also hold deeper and more flexible conversations with users to meet their different emotional needs.
For example, generative AI can be used to provide users with texts, images, music, etc. of different styles and themes, or to generate some novel and interesting content for users, such as poems, stories, jokes, etc.
Third, it changes the relationship and meaning of human-computer interaction. Through generative AI, users can establish a closer and more trusting connection with artificial intelligence, and even develop a sense of co-creation and cooperation.
Chatbots based on LLM can provide users with more feedback and suggestions, or share their thoughts and feelings with users. Generative AI can also make users more aware of their own and artificial intelligence’s strengths and limitations, and how to better utilize and develop them.
Fourth, it expands the fields and scenarios of human-computer interaction. Generative AI applications such as ChatGPT adapt and generalize well and can be applied in many fields and scenarios, such as education, entertainment, medical care and business. Whether users want to learn, play, consult or shop, they can achieve their goals by communicating with applications such as ChatGPT.
Fifth, it enhances the fun and intimacy of human-computer interaction. Generative AI applications based on LLMs have rich knowledge and personality. They can adjust their language style and topics according to the user's interests and emotions, and can even generate humor, poetry, stories and other creative content to entertain users.
In this way, users will not feel that communicating with the robot is a boring thing, but will feel that communicating with the robot is an interesting and warm thing.
LLM has an important and complex impact on human-computer interaction, giving it great development potential and industry application value in various fields. Organizations should actively explore and utilize LLM and generative AI to improve the level and experience of human-computer interaction, improve the efficiency and quality of human-computer interaction, enhance human-computer interaction relationships, and expand the fields and scenarios of human-computer interaction.
Of course, we should also pay attention to the risks and challenges it brings, and how to use and supervise it reasonably.
It should be noted that generative AI based on large language models is rapidly integrating with RPA. Generative AI will bring a qualitative leap to RPA’s human-computer interaction.
LLM changes RPA human-computer interaction
RPA can automate repetitive, rule-based, low-value business processes, improving efficiency while reducing costs and errors. But it also faces challenges and limitations: it struggles with complex, changing, high-value business scenarios, adapts poorly to changes in business processes, needs constant maintenance and updates, and has difficulty with unstructured tasks that require creativity or judgment.
Although the hyperautomation architecture has made RPA stable enough in operation, complex processes still carry hidden risks to stable operation.
In the past, manufacturers tried various ways to solve these problems but could not eliminate them fundamentally. It was not until the emergence of generative AI based on LLMs that many of the problems RPA had run into found a solution.
As for how LLM affects RPA, Wang Jiwei channel (id: jiwei1122) has already introduced it in detail in the article "Big AI models such as GPT are coming, super automation based on RPA is still the best implementation carrier".
Here, let’s briefly talk about how LLM changes the human-computer interaction of RPA.
LLM can provide RPA with more powerful natural language processing capabilities, more powerful knowledge acquisition and reasoning capabilities, and more powerful generation and creation capabilities.
Specifically, the impact of LLM on RPA human-computer interaction can be reflected in the following aspects:
Improve the intelligence level of RPA. With an LLM, RPA can better recognize and understand the user's natural language input and generate natural language that meets the user's needs and intentions. It can also generate appropriate operation steps from the context and the goal, carry out multi-turn dialogue and reasoning, handle more complex and diverse business scenarios, and so achieve more complex and flexible business process automation.
Users can talk to RPA by voice or text and tell it what tasks to perform, without designing processes through complex programming or drag-and-drop components.
In addition, LLM can also help RPA perform knowledge extraction and reasoning, thereby providing more valuable information and suggestions.
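A minimal sketch of how a natural language request might be turned into operation steps for a robot is given below. Everything named here is an assumption made for illustration: `fake_llm` is a canned stand-in for a real model call, and the action names, the JSON step format and the `execute` function are hypothetical rather than taken from any RPA product.

```python
import json
from typing import Callable

# Hypothetical whitelist of operations the robot is allowed to perform.
ALLOWED_ACTIONS = {"open_app", "type_text", "click"}


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; always returns the same canned JSON plan."""
    return json.dumps([
        {"action": "open_app", "target": "expense_system"},
        {"action": "type_text", "target": "amount", "value": "120.50"},
        {"action": "click", "target": "submit"},
    ])


def plan_from_request(request: str, llm: Callable[[str], str]) -> list[dict]:
    """Ask the model for steps, then validate them before anything is executed."""
    prompt = (
        "Turn the request into a JSON list of steps. "
        f"Allowed actions: {sorted(ALLOWED_ACTIONS)}.\nRequest: {request}"
    )
    steps = json.loads(llm(prompt))
    for step in steps:
        if step.get("action") not in ALLOWED_ACTIONS:
            raise ValueError(f"model proposed an unsupported action: {step}")
    return steps


def execute(steps: list[dict]) -> list[str]:
    """Pretend to run each step and return a log; a real robot would drive the GUI."""
    return [f"{step['action']} -> {step['target']}" for step in steps]


plan = plan_from_request("File my taxi expense of 120.50", fake_llm)
log = execute(plan)
```

Validating the model's output against a fixed set of actions is one possible design choice: it keeps the conversational front end flexible while the robot only ever performs operations it already knows how to do.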
Expand the application scope of RPA. An LLM can effectively widen the range of RPA applications, allowing software robots to handle more tasks that involve natural language, such as text classification, text summarization, text generation, machine translation and question answering. It can also work with data in other modalities, such as images, audio and video, to support richer, multi-dimensional business processes.
An LLM also allows software robots to integrate and collaborate with other technologies such as OCR, NLP, low code, process mining and chatbots to achieve hyperautomation.
By using LLM, RPA can transcend language and cultural barriers and serve a wider and more diverse range of customers and markets.
Increase the innovation potential of RPA. An LLM can enhance the creativity and flexibility of RPA, enabling it to generate suitable text, such as reports, summaries and recommendations, for different scenarios and data. For example, RPA can automatically generate a blog article from the keywords or topics a user provides and insert relevant pictures, videos and links into it.
By using an LLM, RPA can learn and generate in a more flexible and adaptive way, producing more novel and interesting content and solutions. It can also collaborate and communicate with humans more effectively and in a friendlier way, sparking more creativity and inspiration.
Improve RPA development efficiency. Generative AI allows users to define and modify business processes through simple language descriptions, without writing complex code or using graphical interfaces. It can also optimize and adjust business processes based on user feedback and data analysis to achieve continuous improvement.
Optimize RPA interaction experience and user satisfaction. RPA integrated with an LLM can hold more natural, friendly and interesting conversations with users, increasing their trust and participation. It can adjust its tone and style according to the user's emotions and interests, and even add some humor or a well-known quotation to lighten the atmosphere.
Extended reading: ChatGPT is integrated with RPA, and the generative AI automated process doubles the value of AIGC
Of course, the impact of LLM on RPA human-computer interaction is not only at the level of intelligence, efficiency and innovation, it also directly affects the changes in RPA software architecture.
Postscript: RPA architecture changes under the influence of LLM
Before LLMs, RPA had already greatly improved human-computer interaction in program development and process automation. Many manufacturers had also launched the concept of "RPA available to everyone". Behind this concept, RPA is becoming increasingly easy to use, which makes it easier to develop programs and implement process automation with it.
In terms of ease of use, manufacturers have made many explorations and attempts, from CV to screen capture to AI models. In RPA program development, based on AI, zero-code and other technologies, RPA is gradually moving away from the original "drag and drop" form and transitioning to "click to use" and conversational (including voice-driven) process creation.
In terms of human-computer interaction, conversational process creation can be said to be the ultimate state of RPA and even of hyperautomation. In the future, when we use hyperautomation, we will be able to create software robots or automated programs by typing a few lines or saying a sentence to the system.
But earlier conversational creation only suited simple, preset processes. For a slightly more complex process it was ineffective, or needed many more steps to trigger and coordinate additional processes. The robustness of the resulting process was hard to test, and users had to be familiar with the corresponding syntax and instructions to use it.
In terms of application experience, there was still room for improvement.
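The limitation of pre-LLM conversational creation can be illustrated with a deliberately rigid command parser. The command grammar below is invented for this example and does not come from any particular product; it only shows why users had to know the exact syntax.

```python
import re

# Hypothetical fixed grammar: "create flow: open <app>; click <button>"
COMMAND = re.compile(r"^create flow: open (?P<app>\w+); click (?P<button>\w+)$")


def parse_command(text: str):
    """Return a list of steps if the text matches the grammar exactly, else None."""
    match = COMMAND.match(text.strip())
    if match is None:
        return None
    return [("open", match["app"]), ("click", match["button"])]


# Works only when the user knows the exact syntax...
exact = parse_command("create flow: open crm; click export")

# ...and fails on an ordinary paraphrase of the same request.
paraphrase = parse_command("please open the CRM and press export")
```

Here `exact` holds two steps while `paraphrase` is `None`, which is the gap that natural language understanding is meant to close.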
After the emergence of LLM, for RPA products that incorporate generative AI, users can drive RPA to create processes using natural language.
Generative AI also makes up, through generated content, for RPA's shortcomings in areas such as emotion recognition and unstructured data processing. Anyone can drive RPA to develop all kinds of automation programs more simply, quickly and efficiently without much learning, which truly makes RPA available to everyone.
In the past, people operated RPA directly, building programs by dragging and dropping building blocks. Now people communicate with generative AI such as GPT in natural language. Once it understands the person's intent, the multimodal AI drives RPA to connect to enterprise management software and automate business processes.
Large AI models such as GPT thus link people with systems such as RPA: they take in people's intentions from above and direct RPA robots below, becoming the link between people and automated systems and making program development and process automation simpler to operate.
GPT connects people and RPA-based hyperautomation, which is a huge progress in human-computer interaction experience.
In Wang Jiwei Channel's view, in moving from the past "human + RPA" to the current "human + generative AI + RPA", the introduction of LLMs and generative AI has greatly improved the human-computer interaction of RPA products, but in essence what the LLM changes is the architecture of RPA.
Almost all manufacturers are now investing heavily in the full integration of LLMs, RPA and hyperautomation, and RPA products have added a model layer to their architecture.
This means that whether it calls a third-party model or a self-developed one, RPA has become a standard application on top of the model layer.
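One way to picture a model layer that accepts either a third-party or a self-developed model is a small shared interface. The sketch below is an assumption about how such a layer could be organized, not a description of any vendor's architecture; `ModelBackend`, `ThirdPartyModel`, `SelfDevelopedModel` and `RpaPlatform` are hypothetical names, and no real model is called.

```python
from typing import Protocol


class ModelBackend(Protocol):
    """Anything the platform can ask to turn a request into a plan (hypothetical)."""

    def complete(self, prompt: str) -> str: ...


class ThirdPartyModel:
    """Stand-in for a hosted model reached over a vendor API (no real call is made)."""

    def complete(self, prompt: str) -> str:
        return f"[third-party] steps for: {prompt}"


class SelfDevelopedModel:
    """Stand-in for a model the RPA vendor trains and hosts itself."""

    def complete(self, prompt: str) -> str:
        return f"[self-developed] steps for: {prompt}"


class RpaPlatform:
    """The RPA application sits on top of the model layer and is unaware of its origin."""

    def __init__(self, model: ModelBackend) -> None:
        self.model = model

    def handle(self, request: str) -> str:
        return self.model.complete(request)


hosted = RpaPlatform(ThirdPartyModel())
in_house = RpaPlatform(SelfDevelopedModel())
```

Because `RpaPlatform` depends only on the `complete` method, swapping one model for another does not change the rest of the product, which is what treating RPA as an application on the model layer implies.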
It can be predicted that as LLM becomes the standard configuration of RPA, it will also comprehensively revolutionize RPA in the era of large models.