Zen AI Blog

Towards AI Workflows for Software Engineering

Written by Lenz Belzner | May 24, 2024 4:13:14 PM

Hey everyone, Lenz here, and I'm genuinely delighted to be sharing with you today. Let's cut to the chase: I'm here to demystify a bit of the hype surrounding artificial intelligence (AI) and its application within software engineering. No coats and ties needed — just some thoughts about the future we're all shaping together.

Before we dive into the thick of it, I want to acknowledge something: AI isn't just a fancy term; it's reshaping the landscape of technology as we know it. This isn't always easy to navigate, especially given the nuances and complexities that come with implementing AI in actual projects.

A Quick Background

I started off my journey as a computer scientist with a Ph.D. from LMU, focusing on autonomous decision-making systems using reinforcement learning. Sounds fancy, right? After my academic pursuits, I transitioned into a three-year leadership role as a business unit manager at MaibornWolff, where I built up the Data Science and AI business unit from the ground up. Currently, I hold a research professorship on software methodologies for autonomous mobility systems at the AImotion institute at THI in Ingolstadt. Simply put, we're engineering the brains behind self-driving cars and mobility solutions.

My interest in language technologies really took off when ChatGPT, built on GPT-3.5, hit the scene in winter 2022. That period, I must say, transformed my outlook on the sheer potential of conversational AI, including chatbots and language models like GPT. Today, I run a company where we unlock their potential for software engineering, and I'm here to share some of these insights.

What’s on This Post's Agenda?

We'll kick things off with a quick dive into what AI actually entails, hopefully dialing down the hype a notch. Next, I'll present a few solid examples of how AI can be harnessed in software engineering, particularly on tasks that don’t involve pure coding. And then, I’ll wrap up with some reflections on what these advancements mean for you and how you can leverage them for optimal outcomes. By the end, you'll have a robust understanding of the fundamentals.

Alright then, gear up. Let’s get started with a foundational overview of AI to set the stage for the more hands-on discussions to follow.

Understanding the Essence of AI

AI, in its fundamental form, is about machines mimicking cognitive functions like learning and problem-solving. The recent buzz has been particularly around something called "Large Language Models" or LLMs, which are what power tools like GPT. Imagine feeding a colossal dataset comprising almost all the text available on the Internet into a machine. Now, train this machine to predict the next word in a sequence of text, a glorified word completion game of sorts. This huge prediction engine, implemented as a neural network, essentially sums up the underlying "magic" of these models.

Cracking the Code

At its core, an LLM like GPT-4 is trained by taking a sentence and predicting the next word. We feed this data into a neural network and tweak it until its predictions align well with our vast dataset. It sounds overly simplified, but that’s the crux. Of course, in detail there's more complexity involved, but fundamentally, that's how it works.
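To make the next-word game a bit more tangible, here is a deliberately tiny sketch. It is not how GPT-4 is built: instead of a neural network it simply counts which word follows which, and the three-sentence corpus and function names are made up for illustration. The idea of "look at text, learn to guess what comes next" is the same, though.

```python
from collections import Counter, defaultdict
from typing import Optional


def train_bigram_model(corpus: list[str]) -> dict[str, Counter]:
    """Count, for every word, which words follow it in the corpus."""
    follow_counts: dict[str, Counter] = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        # Pair each word with its successor: (w0, w1), (w1, w2), ...
        for current_word, next_word in zip(words, words[1:]):
            follow_counts[current_word][next_word] += 1
    return follow_counts


def predict_next_word(model: dict[str, Counter], word: str) -> Optional[str]:
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Toy corpus, invented for this example.
corpus = [
    "the model predicts the next word",
    "the model learns from text",
    "the model improves with data",
]
model = train_bigram_model(corpus)
print(predict_next_word(model, "the"))   # most frequent follower of "the"
print(predict_next_word(model, "next"))  # most frequent follower of "next"
```

A real LLM replaces the counting table with a neural network that looks at a long stretch of preceding text rather than a single word, but the training signal is still "how well did you guess the next word?".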

Once trained, these models are then fine-tuned to complete tasks rather than merely predict words. This is achieved by generating multiple solutions for a given problem and getting human feedback on which is best. This feedback is in turn used to train a 'reward model' approximating human preferences for task completion. The reward model then helps to train the final LLM to maximize human preference via reinforcement learning.
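For the curious, the reward model step can be sketched in a few lines. I'm assuming the commonly used pairwise formulation here, where the reward model should score the human-preferred answer higher than the rejected one; the post above does not depend on this exact loss, and the function name is my own.

```python
import math


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise logistic loss, -log(sigmoid(chosen - rejected)).

    The loss is small when the preferred answer gets the higher reward
    and large when the ranking is the wrong way round.
    """
    margin = reward_chosen - reward_rejected
    # Two algebraically equal branches, chosen to avoid overflow in exp().
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The reward model agrees with the human ranking: low loss.
print(preference_loss(reward_chosen=2.0, reward_rejected=0.0))
# The reward model disagrees with the human ranking: high loss.
print(preference_loss(reward_chosen=0.0, reward_rejected=2.0))
```

Training the reward model means nudging its parameters so that this loss shrinks across many human-ranked answer pairs.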

These steps build a comprehensive system that excels not just at completing text, but also at emulating understanding and solving diverse problems. When we use these models in real-world applications, they produce solutions by running the fixed internal representations learned during training; the model itself does not change while you use it.

The Unique Capabilities of AI

What makes these AI systems so remarkable is their newly unlocked abilities. Each of these capabilities can now be accessed for a nominal computational cost, dramatically lowering barriers to entry that previously required significant investments in risky projects.

Commonsense Reasoning

These models now possess a form of machine commonsense, allowing them to understand contexts and respond appropriately even without detailed domain-specific information.

A colleague of mine, working as a Data Lakehouse architect at a prominent German automotive manufacturer, had an insightful interaction with ChatGPT. He approached the model with a seemingly straightforward request: to discuss a data lakehouse architecture. However, the AI, in its initial attempt, misunderstood "lakehouse" quite literally, starting to describe the construction of a lakeside house, foundation and all.

Realizing the confusion, my colleague clarified that he was referring to a data lakehouse architecture. In an impressive display of adaptability, the language model promptly revised its approach, interpreting and responding correctly within the new context. This seamless shift, triggered by a single term clarification, underscores AI’s powerful contextual understanding. Such instances vividly demonstrate how AI can navigate and adapt to complex, domain-specific dialogues, making it an invaluable tool in contemporary technical discussions.

This interaction exemplifies the sophisticated level of "commonsense reasoning" these models possess, allowing professionals to engage in more intuitive and efficient problem-solving dialogues with AI.

Data Transformation

We've developed tools enabling these models to transform data and execute instructions through natural language programming.

One of the most remarkable features of modern AI is its proficiency in transforming data using natural language commands. Picture this: you're tasked with converting a detailed project specification document into a structured database schema. Traditionally, this would involve meticulous manual work, interpreting each requirement and translating it into database tables, attributes, and relationships.

With AI, this process becomes significantly more straightforward. You simply provide the language model with the project requirements in natural language, instructing it to generate the corresponding SQL commands or schema configurations. The AI processes these instructions and outputs a detailed, formal representation of the database schema, saving hours of manual labor.
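Since model output should never be trusted blindly, one cheap safeguard is to run the generated SQL against a throwaway database before anyone builds on it. The sketch below assumes SQLite as the target dialect; the prompt wording, the function names, and the hard-coded "model output" are all made up for illustration, and the actual call to a language model is left out on purpose.

```python
import sqlite3


def build_schema_prompt(requirements: str) -> str:
    """Wrap plain-language requirements in an instruction for the model."""
    return (
        "Translate the following project requirements into SQL CREATE TABLE "
        "statements. Return only SQL.\n\n"
        f"Requirements:\n{requirements}"
    )


def validate_schema(sql: str) -> list[str]:
    """Run the DDL in an in-memory SQLite database and return the table names.

    Raises sqlite3.Error if the generated SQL is not valid.
    """
    connection = sqlite3.connect(":memory:")
    try:
        connection.executescript(sql)
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        return [name for (name,) in rows]
    finally:
        connection.close()


# Hypothetical model output, hard-coded here instead of calling an LLM.
generated_sql = """
CREATE TABLE event (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    starts_at TEXT NOT NULL
);
CREATE TABLE registration (
    id INTEGER PRIMARY KEY,
    event_id INTEGER NOT NULL REFERENCES event(id),
    attendee_email TEXT NOT NULL
);
"""

tables = validate_schema(generated_sql)
print(tables)
```

If the model hallucinates syntax, `validate_schema` fails loudly, and you can feed the error message back into the next prompt.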

This capability extends beyond just database schemas. For instance, you can instruct the AI to generate UML diagrams, convert user stories into detailed task lists, or even translate complex architectural concepts into visual workflows. By leveraging natural language, AI democratizes access to complex data transformation tasks, enabling even those without deep technical expertise to participate in and contribute to these processes efficiently.

Such transformative power fundamentally alters how we handle and visualize data, making sophisticated data management tasks more accessible and streamlined than ever before.

Emergent Behavior

Although trained to react to task descriptions in a textual manner, LLMs emergently show other capabilities as well. For example, they exhibit goal-oriented, agentic behavior, and abilities like abstracting tasks and decomposing problems into manageable parts, though their exact capabilities are still being explored. We'll explore this in a future post.

Let's now see what this all means in practice.

Practical AI in Software Engineering

AI’s Real-World Impact on Software Engineering

Alright, with our foundation laid, let's transition to the practical realm—how AI integrates into the workflows typical in software engineering. We'll focus on sections of the development lifecycle where AI proves especially valuable, beyond mere coding. Imagine embarking on a journey, navigating through tasks like requirements analysis, architecture design, and project management. AI can be a formidable ally in streamlining these tasks.

A Common Scenario: Proposal Preparation

Ever faced the daunting task of preparing a response to a request for proposal (RFP)? These documents can be extensive and intricate, often stretching hundreds of pages. Picture this: you receive an RFP that’s a classic embodiment of complexity. It’s from a reputable client looking for an event management system to serve thousands of users.

Now, sifting through this document is no small feat. Typically, it might take hours, if not days, to digest, synthesize, and respond effectively. Here’s where AI steps in. Let’s dive into a real-world example to illustrate this.

Breaking Down the RFP with AI

Imagine taking this comprehensive RFP and feeding it into an AI system. The AI scans and digests the document and provides a summarization. Your prompt might be: "Please summarize this RFP while retaining crucial details necessary for crafting a precise solution." The output is not just a reduction in text but a coherent summary highlighting key requirements and objectives. This sets the stage for a structured response.

Here’s a pro-tip: Not everything can be automated seamlessly yet. At this level of complexity, chunking the task into manageable portions usually yields better results. Once summarized, you can ask the AI to extract and clarify uncertainties. A prompt like, "Identify open questions within this RFP that need clarification to tailor an accurate proposal," becomes invaluable. What you get is a comprehensive list of intelligent questions you might pose to the client—questions that perhaps you hadn't initially considered.
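The chunking advice above can be sketched roughly like this. I'm assuming that paragraphs are separated by blank lines and that a simple character budget is a good enough stand-in for the model's context limit; the 2000-character default and the function names are arbitrary choices for illustration.

```python
SUMMARY_INSTRUCTION = (
    "Please summarize this RFP while retaining crucial details "
    "necessary for crafting a precise solution."
)


def chunk_paragraphs(text: str, max_chars: int = 2000) -> list[str]:
    """Group paragraphs into chunks of at most `max_chars` characters.

    A single paragraph longer than the limit is kept whole as its own chunk.
    """
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for paragraph in (p.strip() for p in text.split("\n\n")):
        if not paragraph:
            continue
        # Account for the blank line that joins paragraphs inside a chunk.
        added_len = len(paragraph) + (2 if current else 0)
        if current and current_len + added_len > max_chars:
            chunks.append("\n\n".join(current))
            current, current_len = [], 0
            added_len = len(paragraph)
        current.append(paragraph)
        current_len += added_len
    if current:
        chunks.append("\n\n".join(current))
    return chunks


def build_summary_prompts(rfp_text: str, max_chars: int = 2000) -> list[str]:
    """Create one summarization prompt per chunk of the RFP."""
    chunks = chunk_paragraphs(rfp_text, max_chars)
    return [
        f"{SUMMARY_INSTRUCTION}\n\n(Part {number} of {len(chunks)})\n\n{chunk}"
        for number, chunk in enumerate(chunks, start=1)
    ]


# Stand-in for a real RFP: three paragraphs of 40 characters each.
sample_rfp = "\n\n".join(["a" * 40, "b" * 40, "c" * 40])
sample_chunks = chunk_paragraphs(sample_rfp, max_chars=90)
print(len(sample_chunks))
```

Splitting on paragraph boundaries rather than at a fixed character offset keeps each requirement intact, which tends to matter more for summary quality than perfectly even chunk sizes.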

Consider the task of defining roles and permissions for the proposed system. The AI here can be extremely effective. For instance, you might instruct, "Generate potential roles and define their respective permissions based on this RFP." The AI proposes roles such as "Event Manager," "System Administrator," and "User," each with well-defined permissions. This not only accelerates your initial setup but provides a clear framework for detailed discussions and iterations.
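To make such a role proposal usable downstream, it helps to ask the model for a structured format and check it before use. The sketch below assumes you asked for JSON that maps each role name to a list of permission strings; the role names come from the example above, while the permission names and the hard-coded "model output" are invented for illustration.

```python
import json


def parse_roles(model_output: str) -> dict[str, set[str]]:
    """Parse a JSON object of role -> list of permissions and validate its shape."""
    data = json.loads(model_output)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object mapping roles to permissions")
    roles: dict[str, set[str]] = {}
    for role, permissions in data.items():
        if not isinstance(permissions, list) or not all(
            isinstance(permission, str) for permission in permissions
        ):
            raise ValueError(f"permissions for role {role!r} must be a list of strings")
        roles[role] = set(permissions)
    return roles


def is_allowed(roles: dict[str, set[str]], role: str, permission: str) -> bool:
    """Check whether a role carries a permission; unknown roles get nothing."""
    return permission in roles.get(role, set())


# Hypothetical model output, hard-coded here instead of calling an LLM.
model_output = """
{
  "Event Manager": ["create_event", "edit_schedule"],
  "System Administrator": ["manage_permissions", "create_event"],
  "User": ["view_event"]
}
"""

roles = parse_roles(model_output)
print(is_allowed(roles, "Event Manager", "edit_schedule"))
print(is_allowed(roles, "User", "manage_permissions"))
```

A structure like this is also a handy artifact for the stakeholder discussion: it is easy to diff between iterations.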

This is an iterative process, mind you. Review each AI-generated output, tweak where necessary, and refine. AI currently excels in assisting with these repetitive yet slightly variant tasks, ensuring that you maintain strategic oversight without getting bogged down in minutiae.

Story Mapping and Task Identification

As you continue, translating the summarized requirements into actionable work packages becomes the next hurdle. Imagine you're building an agile structure. Your prompt might be, "Draft initial user stories from these summarized requirements." The AI churns out a series of user stories—some may need fine-tuning, but it gives you a significant head start. For instance, a user story could be, "As an Event Manager, I want to create and manage event schedules to efficiently organize events."

From user stories, you can extract epics and tasks. A further prompt may be, "Break down this user story into tasks and acceptance criteria." The AI dissects it into actionable chunks: tasks like "Design the event creation form," and "Implement schedule management features," with corresponding acceptance criteria.

Through this method, every snippet of information is distilled and organized. Importantly, AI here is not about handing over the reins completely but augmenting your capacity to handle expansive and complex documents efficiently, allowing you more time to refine and validate the outcomes.

Enhancing Software Modeling and Visualization

Turning Text into Visuals

Let’s push further into how AI supports us beyond organizing information—specifically, how it aids in transforming text-based requirements into visual models. Software engineering heavily relies on visual models for clarity and communication. From UML diagrams to simple flowcharts, these visualizations make complex systems comprehensible at a glance.

From User Stories to Diagrams

Imagine you've established your user stories and tasks from the previous section. Now comes the part where you need to represent these stories visually. You might say, “Transform this user story into a UML use case diagram.” The AI moves into action, generating a diagram that maps out actors, use cases, and their relationships.

Here's a more concrete example. You have a user story: "As a System Administrator, I want to manage user permissions to ensure proper access control." You prompt the AI: “Create a UML diagram for this user story.” In response, you get a use case diagram where the "System Administrator" is tied to use cases like "Manage Permissions" and actors relevant to the system.
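In practice, the AI usually returns such a diagram as text in a diagram language rather than as a picture. To give you a feel for what that looks like, here is a sketch that produces the same kind of text by hand, assuming PlantUML as the notation. The function name and the second use case are made up for illustration.

```python
def use_case_diagram(actor: str, use_cases: list[str]) -> str:
    """Build PlantUML source for one actor connected to a list of use cases."""
    lines = [
        "@startuml",
        "left to right direction",
        f'actor "{actor}" as actor_1',
    ]
    for index, use_case in enumerate(use_cases, start=1):
        lines.append(f'usecase "{use_case}" as uc_{index}')
        lines.append(f"actor_1 --> uc_{index}")
    lines.append("@enduml")
    return "\n".join(lines)


diagram = use_case_diagram(
    "System Administrator",
    ["Manage Permissions", "Review Access Log"],  # second use case is hypothetical
)
print(diagram)
```

Because the diagram is plain text, you can version it, diff it, and ask the AI to revise it just like any other document.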

Enhancing Data Models

AI doesn't stop at user stories; it extends to data modeling too. Let's say you need to define the database structure for your application. Begin with a textual description: “List entities and their relationships for an event management system.” The AI outlines entities like "Event," "User," "Location," each with attributes and relationships. Next, you prompt: “Generate an ER (Entity-Relationship) diagram based on this entity list.”

The AI produces a textual representation of the ER model that can then be converted into a visual diagram using tools like PlantUML or a similar text-to-diagram generation tool. The generated diagram provides a clear visual representation of how each entity interacts, forming a backbone for your database schema.

Iterative Refinement

Remember, these outputs aren’t always perfect but serve as robust starting points. Nowadays, tools like Mermaid or Excalidraw are at your disposal. For instance, say you need to tweak the generated diagram to reflect a more accurate representation or fit into your preferred tool. Export the text-based model to Mermaid syntax and further refine it using Excalidraw, adjusting node positions, labels, and relationships to match your envisioned structure precisely.
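As a rough illustration of what such a text-based model looks like in Mermaid syntax, here is a sketch that renders an entity list as a Mermaid `erDiagram`. The entity names follow the example above; the attributes, relationship labels, and function name are assumptions of mine.

```python
def mermaid_er_diagram(
    entities: dict[str, dict[str, str]],
    relationships: list[tuple[str, str, str, str]],
) -> str:
    """Render entities (name -> {attribute: type}) and relationships as Mermaid text.

    Each relationship is (left entity, cardinality symbol, right entity, label).
    """
    lines = ["erDiagram"]
    for name, attributes in entities.items():
        lines.append(f"    {name} {{")
        for attribute_name, attribute_type in attributes.items():
            lines.append(f"        {attribute_type} {attribute_name}")
        lines.append("    }")
    for left, cardinality, right, label in relationships:
        lines.append(f"    {left} {cardinality} {right} : {label}")
    return "\n".join(lines)


# Attributes and relationships are hypothetical examples.
entities = {
    "EVENT": {"id": "int", "title": "string"},
    "USER": {"id": "int", "email": "string"},
    "LOCATION": {"id": "int", "address": "string"},
}
relationships = [
    ("LOCATION", "||--o{", "EVENT", "hosts"),
    ("USER", "}o--o{", "EVENT", "attends"),
]

er_text = mermaid_er_diagram(entities, relationships)
print(er_text)
```

You can paste the resulting text into any Mermaid-capable tool and continue refining it visually from there.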

Dealing with Complex Systems

As systems get more complex, the need for detailed hierarchical models increases. Suppose you’re working on a feature that involves various modules interacting together. Ask the AI: “Outline the architecture for a modular system handling event registrations, notifications, and reporting.” Based on this textual architecture description, subsequent prompts guide the AI to visually represent these components and their interconnections.

You might go a step further: “Convert this architecture outline into a C4 model diagram.” The AI generates a high-level Context and Container diagram highlighting the main components, their roles, and how they intercommunicate. This visualization becomes instrumental in project discussions, sprint planning, and stakeholder presentations.

AI Beyond Text - Exploring Multimodal AI Capabilities

While the abilities of AI in handling text are impressive, the scope of what these models can achieve goes much further. The concept of multimodal capabilities refers to the AI's proficiency in processing and interpreting multiple forms of data — such as text, audio, and images. This diversity extends the utility of AI beyond straightforward text interactions, opening up a plethora of practical applications in varied contexts.

Audio Transcription and Analysis

To illustrate, here's a practical scenario: You're in a meeting or a brainstorming session, discussing the features and architecture of a new system. Traditionally, capturing the richness of such conversations would require extensive note-taking, often leading to loss of crucial points in the hustle of the moment.

Imagine recording the meeting and then utilizing AI to transcribe the entire conversation in near real-time. Last autumn, I had an experience where my colleague and I were conceptualizing an API project while sitting at a beer garden. We recorded our discussion on a smartphone. The AI flawlessly transcribed our conversation, converting our spoken words into a detailed text document. This transcript then became a foundational document, from which we derived design specifications, task lists, and even timeline estimates.

The real magic happens post-transcription. You can take the transcribed text and feed it into language models to generate detailed project specifications as described above. This turns a colloquial conversation about system specs into a fully fledged system design in a fraction of the time it would take without AI support.

Image Processing and Interpretation

Audio transcription is just the tip of the iceberg. AI's capabilities extend to image processing as well. Consider this: In software engineering and development, diagrams are indispensable. A large part of our work involves creating and understanding UML diagrams, flowcharts, architectural blueprints, and so on.

With AI, you can now process images of these diagrams to extract meaningful data. For instance, you could upload a hand-drawn architecture sketch or a whiteboard snapshot, and the AI can interpret and convert it into a digital, editable format. This doesn't just save time but also ensures that the nuances captured in freehand drawings are preserved and formalized digitally.

Document and PDF Processing

Yet another multimodal capability lies in document and PDF processing. Many RFPs, specifications, and project documents are shared as large PDF files. AI can process these extensive documents, extracting tabular data, identifying sections, and even translating visual elements like charts and graphs into navigable data structures.

For example, with a complex multi-page PDF, you could instruct the AI to “Extract all tables and convert them to CSV format” or “Summarize section headers and generate an outline.” This ability to parse and reorganize large documents ensures that essential information is easily accessible and can be manipulated programmatically.
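Once the AI has pulled a table out of a PDF, the "manipulated programmatically" part is plain old code. Here is a minimal sketch using Python's standard `csv` module, assuming the extracted table arrives as a list of rows; the table content is invented for illustration, and the extraction step itself is not shown.

```python
import csv
import io


def table_to_csv(rows: list[list[str]]) -> str:
    """Serialize table rows to CSV text, quoting cells that contain commas."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical table as it might come back from the extraction step.
extracted_table = [
    ["Requirement", "Priority"],
    ["Supports 5,000 concurrent users", "High"],
    ["Export attendee lists", "Medium"],
]

csv_text = table_to_csv(extracted_table)
print(csv_text)
```

Using the `csv` module instead of joining strings with commas matters here: real requirement texts contain commas and quotes, and the module escapes them correctly.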

Enhancing Workflow Integration

Integration of these multimodal capabilities into existing workflows is seamless and transformative. Picture this: You're working on a detailed project proposal. You have text specifications, voice recordings of client meetings, and various architectural sketches. AI can integrate all these diverse inputs, providing a cohesive, comprehensive output that blends textual data, transcribed meetings, and digitized sketches.

By doing so, AI creates a unified, navigable dataset from disparate sources, significantly enhancing the workflow efficiency. This integration not only speeds up the initial phases of understanding and planning but also ensures robust documentation and a more agile response to evolving project requirements.

Key Takeaways and Future Implications

By now, we've journeyed through some parts of the versatile landscape of AI in software engineering. From synthesizing complex RFPs to transforming textual user stories into clear visual models, AI proves itself as a potent tool facilitating various stages of the software development lifecycle. But let’s condense these insights and explore what's on the horizon for integrating AI further into our workflows. In this post, we've discussed AI use cases for software engineering such as:

  1. Summarization and Clarification: AI shines in quickly summarizing large documents (e.g. requests for proposals) and flagging areas of ambiguity. Using AI, you can derive understandable summaries and pose intelligent questions that ensure no stone is left unturned, and that proposals are scoped concisely.

  2. Role Definition and Management: By prompting AI to outline roles and permissions, initial task setup becomes more streamlined. This gives you a practical framework that can be further discussed and refined with stakeholders.

  3. Iterative User Story Development: Drafting user stories, deriving epics and tasks from them, and refining those tasks with acceptance criteria reflects AI’s ability to enhance agile practices.

  4. Visual Documentation: The transformation of user stories and requirements into UML diagrams and ER models aids in visualizing system components and their interactions, fostering clearer communication within teams.

  5. Handling Diverse Data Types: Leveraging AI for transcribing audio, interpreting diagrams, and summarizing text captures and retains comprehensive project details, making them readily accessible and actionable.

One of the most remarkable traits of AI is its adaptability. It learns from interactions, fine-tunes its output based on feedback, and consistently improves if you leverage your unique AI interaction data.

The more nuanced our prompts and the better we structure our workflows, the more aligned AI becomes with our requirements. This not only drives efficiency but also enhances the quality of deliverables. Interestingly, AI can help us to do so by observing our workflows and interactions with it. I'll dive into this topic in a future post.

Future Prospects: AI as a Collaborative Partner

Looking forward, the interaction between AI and software engineering will grow symbiotic. Here are some future implications to consider:

  1. Increased Model Sophistication: Expect AI models to evolve, becoming more adept at handling domain-specific tasks with higher accuracy and reliability. This includes greater contextual awareness, reducing the need for constant human oversight.

  2. Ethical and Legal Compliance: As AI becomes integral in handling sensitive data, ensuring compliance with privacy standards and ethical guidelines will be critical. Implementing robust frameworks to manage data securely remains a priority.

  3. Specific AI Solutions: AI solutions tailored to specific business needs - for example for software engineering professionals - will leverage domain-specific data (GitHub, Linear, Jira, Confluence, Miro, Figma, etc.), user experience (e.g. digital whiteboards for AI, or node-based AI flows), and AI workflow assistance. We have seen examples of some software engineering AI workflows in this post.

  4. Human-AI Collaboration: The future isn't solely automated—human expertise and AI will coexist, each complementing the other's strengths. While AI handles data processing and routine tasks, human creativity and strategic thinking will guide the bigger picture.

  5. Training and Skill Development: Investing in continuous learning and upskilling in AI technologies will empower teams to harness these tools effectively, driving innovation and maintaining a competitive edge for your business.

Concluding Thoughts

We've discussed some of the tangible benefits AI brings to software engineering while acknowledging the areas needing human intuition and decision-making. By embracing AI, we open doors to unprecedented efficiency, clarity, and creativity in our workflows.

In transforming how we approach software engineering, AI offers a paradigm shift. It’s not just about faster results but about smarter processes, better decision-making, and ultimately, more innovative solutions. As we march into this exciting future, let’s wield AI not as a tool of replacement but as an instrument of augmentation—supporting and enhancing our ability to create and innovate.

With these thoughts, I invite you to explore, experiment, and embrace AI in your projects. The journey has only just begun.

Thank you for joining me on this deep dive into AI in software engineering. Let's keep pushing the boundaries of what we can achieve together.

Lenz