Mastering the Builder Pattern: Create a Dynamic AI Prompt Generator CLI

Susan Sarandon
Published: 2024-11-15 05:30:03

Have you ever, in your development journey, had to deal with complex objects? Perhaps they have too many parameters, some of them nested, or they need many construction steps and complex logic to be built.

Maybe you want to design a module with a clean and simple interface, without scattering or rethinking the creation code of your complex objects every time!

That's where the builder design pattern comes in!

Throughout this tutorial, we will explain everything about the builder design pattern, and then we will build a Node.js CLI application that generates DALL-E 3 optimized image generation prompts using the builder design pattern.

The final code is available in this GitHub repository.

Overview

Problem

Builder is a creational design pattern, a category of design patterns that deals with the various problems that come with the native way of creating objects using the new keyword or operator.

The Builder Design Pattern focuses on solving the following problems:

  1. Providing a simple interface for creating complex objects: imagine a deeply nested object with many required initialization steps.

  2. Separating the construction code from the object itself, allowing multiple representations or configurations of the same object to be created.

Solution

The Builder Design Pattern solves these two problems by delegating the responsibility of object creation to special objects called builders.

The builder object composes the original object and breaks the creation process down into multiple stages or steps.

Each step is defined by a method in the builder object that initializes a subset of the object's attributes based on some business logic.

class PromptBuilder {
  // Definitely assigned by reset(), which the constructor calls
  private prompt!: Prompt

  constructor() {
    this.reset()
  }

  reset() {
    this.prompt = new Prompt()
  }

  buildStep1() {
    this.prompt.subject = "A cheese eating a burger"
    //initialization code...
    return this
  }

  buildStep2() {
    //initialization code...
    return this
  }

  buildStep3() {
    //initialization code...
    return this
  }

  build() {
    const result = structuredClone(this.prompt) // deep clone
    this.reset()
    return result
  }
}


Client code: we just need to use the builder and call the individual steps.

const promptBuilder = new PromptBuilder()
const prompt1 = promptBuilder
  .buildStep1() // optional
  .buildStep2() // optional
  .buildStep3() // optional
  .build() // we've got a prompt

const prompt2 = promptBuilder
  .buildStep1() // optional
  .buildStep3() // optional
  .build() // we've got a prompt
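
To make the reset behaviour of build() concrete, here is a minimal, self-contained sketch. The SketchPrompt class, its fields, and the step names (buildSubject, buildStyle, buildDetails) are assumptions invented for illustration; the article does not show its Prompt class at this point.

```typescript
// Hypothetical product type, invented for this sketch.
class SketchPrompt {
  subject = ""
  style = ""
  details: string[] = []
}

class SketchPromptBuilder {
  private prompt = new SketchPrompt()

  // Each step initializes one subset of the product's attributes.
  buildSubject(subject: string): this {
    this.prompt.subject = subject
    return this
  }

  buildStyle(style: string): this {
    this.prompt.style = style
    return this
  }

  buildDetails(...details: string[]): this {
    this.prompt.details.push(...details)
    return this
  }

  // Hand over the finished product and start again from a clean slate.
  build(): SketchPrompt {
    const result = this.prompt
    this.prompt = new SketchPrompt()
    return result
  }
}

const sketchBuilder = new SketchPromptBuilder()
const first = sketchBuilder
  .buildSubject("A cheese eating a burger")
  .buildStyle("realistic")
  .build()
const second = sketchBuilder.buildSubject("A burger eating a cheese").build()

console.log(first.style) // "realistic"
console.log(second.style) // "" (the first build() reset the builder)
```

Because build() swaps in a fresh product, the second prompt does not inherit the style set for the first one, which is what lets a single builder instance be reused safely.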


Typical builder design pattern

A typical builder design pattern consists of 4 main classes:

  1. Builder: The builder interface should only define the construction methods, without the build() method, which is responsible for returning the created entity.

  2. Concrete Builder Classes: Each concrete builder provides its own implementation of the Builder interface methods so that it can produce its own variant of the object (an instance of Product1 or Product2).

  3. Client: You can think of the client as the top-level consumer of our objects, the user who is importing library modules or the entry point of our application.

  4. Director: Even the same builder object can produce many variants of the object.

const promptBuilder = new PromptBuilder()
const prompt1 = promptBuilder.buildStep1().buildStep2().build()

const prompt2 = promptBuilder.buildStep1().buildStep3().build()


As you can see from the code above, some entity needs to take responsibility for directing, or orchestrating, the different possible sequences of calls to the builder methods, since each sequence may produce a different resulting object.

So can we further abstract the process and provide an even simpler interface for the client code?

That's where the Director class comes in. The director takes more responsibilities from the client and allows us to factor all of those builder sequence calls and reuse them as needed.

class Director {
  // Assigned through setBuilder() before any make* method is called
  private builder!: PromptBuilder

  setBuilder(builder: PromptBuilder) {
    this.builder = builder
  }

  makePrompt1() {
    return this.builder.buildStep1().buildStep2().build()
  }

  makePrompt2() {
    return this.builder.buildStep1().buildStep3().build()
  }
}


Client code

const director = new Director()
const builder = new PromptBuilder()
director.setBuilder(builder)
const prompt1 = director.makePrompt1()
const prompt2 = director.makePrompt2()


As you can see from the code above, the client code doesn't need to know the details of creating prompt1 or prompt2. It just instantiates the director, sets the correct builder object, and then calls the makePrompt methods.

Practical Scenario

To further demonstrate the builder design pattern's usefulness, let's build a prompt engineering image generation AI CLI tool from scratch.

The source code for this CLI app is available here.

The CLI tool will work as follows:

  1. The CLI will prompt the user to choose one style of prompts: Realistic or Digital Art.
  2. Then it will ask the user to enter a subject for their prompt, for example: a cheese eating a burger.
  3. Depending on their choice (Digital Art or Realistic), the CLI tool will create complex prompt objects with many configuration details.

The realistic prompt will need all of the following configuration attributes to be constructed.

file: prompts.ts

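import {
  ArtStyle,
  BrushTechnique,
  CameraType,
  ColorScheme,
  CompositionRule,
  ImageResolution,
  LensType,
  LightingCondition,
} from "./enums"

// Exported so that builders.ts, serializers.ts and index.ts can import it
export class RealisticPhotoPrompt {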
  constructor(
    public subject: string,
    public location: string,
    public timeOfDay: string,
    public weather: string,
    public camera: CameraType,
    public lens: LensType,
    public focalLength: number,
    public aperture: string,
    public iso: number,
    public shutterSpeed: string,
    public lighting: LightingCondition,
    public composition: CompositionRule,
    public perspective: string,
    public foregroundElements: string[],
    public backgroundElements: string[],
    public colorScheme: ColorScheme,
    public resolution: ImageResolution,
    public postProcessing: string[]
  ) {}
}


file: prompts.ts

export class DigitalArtPrompt {
  constructor(
    public subject: string,
    public artStyle: ArtStyle,
    public colorPalette: string[],
    public brushTechnique: BrushTechnique,
    public canvas: {
      width: number
      height: number
      resolution: ImageResolution
    },
    public layers: number,
    public composition: CompositionRule,
    public perspective: string,
    public lightingEffect: string,
    public textureDetails: string[],
    public backgroundTheme: string,
    public foregroundElements: string[],
    public moodKeywords: string[],
    public artisticInfluences: string[],
    public digitalEffects: string[]
  ) {}
}


As you can see here, each prompt type requires many complex attributes to be constructed, like artStyle, colorPalette, lightingEffect, perspective, camera, etc.

Feel free to explore all of the attribute details, which are defined in the enums.ts file of our project.

enums.ts

// Shared Enums
export enum ImageResolution {
  Low = "512x512",
  Medium = "1024x1024",
  High = "2048x2048",
}

export enum ColorScheme {
  Vibrant = "Vibrant",
  Pastel = "Pastel",
  Monochrome = "Monochrome",
  Sepia = "Sepia",
}

// Realistic Photo Prompt enums
export enum CameraType {
  DSLR = "DSLR",
  Mirrorless = "Mirrorless",
  Smartphone = "Smartphone",
  Drone = "Drone",
}

export enum LensType {
  WideAngle = "Wide Angle",
  Telephoto = "Telephoto",
  Macro = "Macro",
  FishEye = "Fish Eye",
}

export enum LightingCondition {
  Natural = "Natural",
  Studio = "Studio",
  LowLight = "Low Light",
  HighContrast = "High Contrast",
}

// Digital Art Prompt enums
export enum ArtStyle {
  Impressionist = "Impressionist",
  Surrealist = "Surrealist",
  PixelArt = "Pixel Art",
  Cyberpunk = "Cyberpunk",
}

export enum BrushTechnique {
  Impasto = "Impasto",
  Watercolor = "Watercolor",
  Airbrush = "Airbrush",
  DigitalPen = "Digital Pen",
}

export enum CompositionRule {
  RuleOfThirds = "Rule of Thirds",
  GoldenRatio = "Golden Ratio",
  Symmetry = "Symmetry",
  LeadingLines = "Leading Lines",
}


The user of our CLI app may not be aware of all these configurations; they may just want to generate an image based on a specific subject, like a cheese eating a burger, and a style (Realistic or Digital Art).

After cloning the GitHub repository, install the dependencies using the following command:

npm install


After installing the dependencies, run the following command:

npm start


You'll be prompted to choose a prompt type: Realistic or Digital Art.

Then you will have to enter the subject of your prompt. Let's stick with a cheese eating a burger.

Depending on your choice, you will get the following text prompts as a result:

Realistic Style Prompt:

Create a realistic photo of a cheese eating a burger in the Swiss Alps during golden hour.
The weather should be partly cloudy. Use a DSLR camera with a Wide Angle lens at 16mm.
Set the aperture to f/11, ISO to 100, and shutter speed to 1/60.
The lighting should be Natural with a Rule of Thirds composition.
Capture the scene from a low angle view. Include rocky terrain, alpine flowers in the foreground,
and snow-capped peaks, dramatic clouds in the background.
Use a Vibrant color scheme and render at 2048x2048 resolution.
In post-processing, apply HDR tone mapping and clarity enhancement.


Digital Art Style Prompt:

Create a digital art piece featuring a cheese eating a burger in a Cyberpunk style.
Use a color palette of neon blue, electric purple, acid green, deep black. Apply the Digital Pen technique
on a canvas of 1920x1080 at 2048x2048 resolution.
Use 15 layers and follow the Leading Lines composition rule.
Render the scene from a bird's eye view with volumetric fog with light shafts lighting.
Include texture details like grungy surfaces and holographic reflections.
The background should depict a dystopian megacity.
In the foreground, feature flying vehicles, towering skyscrapers, neon signs.
The overall mood should be gritty, high-tech, atmospheric.
Draw inspiration from Blade Runner and Ghost in the Shell.
Finally, apply digital effects including bloom effect, chromatic aberration, film grain.


Copy the generated prompts and paste them into ChatGPT. ChatGPT will use the DALL-E 3 model to generate the images.

Realistic Image Prompt Result

Digital Art Image Prompt Result


Remember the prompt parameters' complexity and the expertise needed to construct each type of prompt, not to mention the ugly constructor calls which are needed.

this.prompt = new RealisticPhotoPrompt(
  "",
  "",
  "",
  "",
  CameraType.DSLR,
  LensType.WideAngle,
  24,
  "f/8",
  100,
  "1/125",
  LightingCondition.Natural,
  CompositionRule.RuleOfThirds,
  "",
  [],
  [],
  ColorScheme.Vibrant,
  ImageResolution.Medium,
  []
)


Disclaimer: This ugly constructor call is not a big issue in JavaScript because we can pass a configuration object with all the properties being nullable.
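
For comparison, the configuration-object alternative mentioned in the disclaimer could look roughly like the sketch below. PhotoPromptOptions, defaultPhotoOptions, and createPhotoPrompt are hypothetical names, and only a handful of the realistic prompt's fields are included to keep the example short.

```typescript
// Hypothetical, trimmed-down option bag; only a few of the article's fields are shown.
interface PhotoPromptOptions {
  subject: string
  location: string
  focalLength: number
  iso: number
  postProcessing: string[]
}

const defaultPhotoOptions: PhotoPromptOptions = {
  subject: "",
  location: "",
  focalLength: 24,
  iso: 100,
  postProcessing: [],
}

// Callers name only the fields they care about; everything else falls back to a default.
function createPhotoPrompt(
  overrides: Partial<PhotoPromptOptions> = {}
): PhotoPromptOptions {
  const merged = { ...defaultPhotoOptions, ...overrides }
  // Copy the array so callers never share (and mutate) the default list.
  return { ...merged, postProcessing: [...merged.postProcessing] }
}

const alpinePrompt = createPhotoPrompt({
  subject: "a cheese eating a burger",
  focalLength: 16,
})

console.log(alpinePrompt.focalLength) // 16
console.log(alpinePrompt.iso) // 100
```

This removes the long positional argument list, but the caller still has to know which values make a good prompt. That knowledge is what the builders in the rest of the tutorial encapsulate.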

To abstract the process of building the prompt and make our code open for extension and closed for modification (O in SOLID), and to make using our prompt generation library seamless or easier for our library clients, we will be opting to implement the builder design pattern.

Let's start by declaring the generic prompt builder interface.

The interface declares a bunch of methods:

  1. buildBaseProperties, buildTechnicalDetails, and buildArtisticElements are the steps for constructing either a Realistic or Digital Art prompt.
  2. setSubject is a shared method between all of our prompt builders; it's self-explanatory and will be used to set the prompt subject.

builders.ts

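import {
  ArtStyle,
  BrushTechnique,
  CameraType,
  ColorScheme,
  CompositionRule,
  ImageResolution,
  LensType,
  LightingCondition,
} from "./enums"
import { DigitalArtPrompt, RealisticPhotoPrompt } from "./prompts"

// Exported so that director.ts and index.ts can import it
export interface PromptBuilder {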
  buildBaseProperties(): this
  buildTechnicalDetails(): this
  buildArtisticElements(): this
  setSubject(subject: string): this
}


export class RealisticPhotoPromptBuilder implements PromptBuilder {
  // Definitely assigned by reset(), which the constructor calls
  private prompt!: RealisticPhotoPrompt

  constructor() {
    this.reset()
  }

  private reset(): void {
    this.prompt = new RealisticPhotoPrompt(
      "",
      "",
      "",
      "",
      CameraType.DSLR,
      LensType.WideAngle,
      24,
      "f/8",
      100,
      "1/125",
      LightingCondition.Natural,
      CompositionRule.RuleOfThirds,
      "",
      [],
      [],
      ColorScheme.Vibrant,
      ImageResolution.Medium,
      []
    )
  }

  setSubject(subject: string): this {
    this.prompt.subject = subject
    return this
  }

  buildBaseProperties(): this {
    this.prompt.location = "Swiss Alps"
    this.prompt.timeOfDay = "golden hour"
    this.prompt.weather = "partly cloudy"
    return this
  }

  buildTechnicalDetails(): this {
    this.prompt.camera = CameraType.DSLR
    this.prompt.lens = LensType.WideAngle
    this.prompt.focalLength = 16
    this.prompt.aperture = "f/11"
    this.prompt.iso = 100
    this.prompt.shutterSpeed = "1/60"
    this.prompt.lighting = LightingCondition.Natural
    this.prompt.resolution = ImageResolution.High
    return this
  }

  buildArtisticElements(): this {
    this.prompt.composition = CompositionRule.RuleOfThirds
    this.prompt.perspective = "low angle view"
    this.prompt.foregroundElements = ["rocky terrain", "alpine flowers"]
    this.prompt.backgroundElements = ["snow-capped peaks", "dramatic clouds"]
    this.prompt.colorScheme = ColorScheme.Vibrant
    this.prompt.postProcessing = ["HDR tone mapping", "clarity enhancement"]
    return this
  }

  build(): RealisticPhotoPrompt {
    const result = Object.assign({}, this.prompt)
    this.reset()
    return result
  }
}


builders.ts

export class DigitalArtPromptBuilder implements PromptBuilder {
  // Definitely assigned by reset(), which the constructor calls
  private prompt!: DigitalArtPrompt

  constructor() {
    this.reset()
  }

  private reset(): void {
    this.prompt = new DigitalArtPrompt(
      "",
      ArtStyle.Impressionist,
      [],
      BrushTechnique.Impasto,
      { width: 1920, height: 1080, resolution: ImageResolution.Medium },
      10,
      CompositionRule.GoldenRatio,
      "",
      "",
      [],
      "",
      [],
      [],
      [],
      []
    )
  }

  setSubject(subject: string): this {
    this.prompt.subject = subject
    return this
  }

  buildBaseProperties(): this {
    this.prompt.artStyle = ArtStyle.Cyberpunk
    this.prompt.colorPalette = [
      "neon blue",
      "electric purple",
      "acid green",
      "deep black",
    ]
    this.prompt.canvas.resolution = ImageResolution.High
    return this
  }

  buildTechnicalDetails(): this {
    this.prompt.brushTechnique = BrushTechnique.DigitalPen
    this.prompt.layers = 15
    this.prompt.composition = CompositionRule.LeadingLines
    this.prompt.perspective = "bird's eye view"
    this.prompt.lightingEffect = "volumetric fog with light shafts"
    return this
  }

  buildArtisticElements(): this {
    this.prompt.textureDetails = ["grungy surfaces", "holographic reflections"]
    this.prompt.backgroundTheme = "dystopian megacity"
    this.prompt.foregroundElements = [
      "flying vehicles",
      "towering skyscrapers",
      "neon signs",
    ]
    this.prompt.moodKeywords = ["gritty", "high-tech", "atmospheric"]
    this.prompt.artisticInfluences = ["Blade Runner", "Ghost in the Shell"]
    this.prompt.digitalEffects = [
      "bloom effect",
      "chromatic aberration",
      "film grain",
    ]
    return this
  }

  build(): DigitalArtPrompt {
    const result = Object.assign({}, this.prompt)
    this.reset()
    return result
  }
}


As you can see from the implementations above, each builder chooses to build its own kind of prompt (the final prompt shapes are different) while sticking to the same building steps defined by the PromptBuilder contract!

Now, let's move on to our Director class definition.

director.ts

import { PromptBuilder } from "./builders"

export class PromptDirector {
  // Assigned through setBuilder() before makePrompt() is called
  private builder!: PromptBuilder

  setBuilder(builder: PromptBuilder): void {
    this.builder = builder
  }

  makePrompt(subject: string): void {
    this.builder
      .setSubject(subject)
      .buildBaseProperties()
      .buildTechnicalDetails()
      .buildArtisticElements()
  }
}


The Director class wraps a PromptBuilder and allows us to create a prompt configuration which consists of calling all the builder methods starting from setSubject to buildArtisticElements.
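
A director can also hold more than one recipe. The sketch below is an illustration only: StepBuilder redeclares the article's four steps so the block is self-contained, while RecordingBuilder, RecipeDirector, and makeQuickPrompt are hypothetical names that do not appear in the project. The stand-in builder simply records which steps ran, which makes the order chosen by the director visible.

```typescript
// Same four steps as the article's PromptBuilder interface, redeclared for a self-contained sketch.
interface StepBuilder {
  setSubject(subject: string): this
  buildBaseProperties(): this
  buildTechnicalDetails(): this
  buildArtisticElements(): this
}

// Hypothetical stand-in builder: it only records which steps ran, in order.
class RecordingBuilder implements StepBuilder {
  readonly calls: string[] = []

  setSubject(subject: string): this {
    this.calls.push(`setSubject:${subject}`)
    return this
  }

  buildBaseProperties(): this {
    this.calls.push("base")
    return this
  }

  buildTechnicalDetails(): this {
    this.calls.push("technical")
    return this
  }

  buildArtisticElements(): this {
    this.calls.push("artistic")
    return this
  }
}

class RecipeDirector {
  constructor(private readonly builder: StepBuilder) {}

  // Mirrors the article's makePrompt: every step, in a fixed order.
  makeFullPrompt(subject: string): void {
    this.builder
      .setSubject(subject)
      .buildBaseProperties()
      .buildTechnicalDetails()
      .buildArtisticElements()
  }

  // Hypothetical second recipe: skips the technical step.
  makeQuickPrompt(subject: string): void {
    this.builder.setSubject(subject).buildBaseProperties().buildArtisticElements()
  }
}

const fullRun = new RecordingBuilder()
new RecipeDirector(fullRun).makeFullPrompt("cheese")

const quickRun = new RecordingBuilder()
new RecipeDirector(quickRun).makeQuickPrompt("cheese")

console.log(fullRun.calls.join(" > ")) // setSubject:cheese > base > technical > artistic
console.log(quickRun.calls.join(" > ")) // setSubject:cheese > base > artistic
```

The client picks a recipe by name, and the sequencing knowledge stays in one place.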

This will simplify our client code in the index.ts file, which we will see in the next section.

serializers.ts

import { DigitalArtPrompt, RealisticPhotoPrompt } from "./prompts"

// Serialization functions
export function serializeRealisticPhotoPrompt(
  prompt: RealisticPhotoPrompt
): string {
  return `Create a realistic photo of ${prompt.subject} in the ${prompt.location} during ${prompt.timeOfDay}. 
The weather should be ${prompt.weather}. Use a ${prompt.camera} camera with a ${prompt.lens} lens at ${prompt.focalLength}mm. 
Set the aperture to ${prompt.aperture}, ISO to ${prompt.iso}, and shutter speed to ${prompt.shutterSpeed}. 
The lighting should be ${prompt.lighting} with a ${prompt.composition} composition. 
Capture the scene from a ${prompt.perspective}. Include ${prompt.foregroundElements.join(", ")} in the foreground, 
and ${prompt.backgroundElements.join(", ")} in the background. 
Use a ${prompt.colorScheme} color scheme and render at ${prompt.resolution} resolution. 
In post-processing, apply ${prompt.postProcessing.join(" and ")}.`
}

export function serializeDigitalArtPrompt(prompt: DigitalArtPrompt): string {
  return `Create a digital art piece featuring ${prompt.subject} in a ${prompt.artStyle} style. 
Use a color palette of ${prompt.colorPalette.join(", ")}. Apply the ${prompt.brushTechnique} technique 
on a canvas of ${prompt.canvas.width}x${prompt.canvas.height} at ${prompt.canvas.resolution} resolution. 
Use ${prompt.layers} layers and follow the ${prompt.composition} composition rule. 
Render the scene from a ${prompt.perspective} with ${prompt.lightingEffect} lighting. 
Include texture details like ${prompt.textureDetails.join(" and ")}. 
The background should depict a ${prompt.backgroundTheme}. 
In the foreground, feature ${prompt.foregroundElements.join(", ")}. 
The overall mood should be ${prompt.moodKeywords.join(", ")}. 
Draw inspiration from ${prompt.artisticInfluences.join(" and ")}. 
Finally, apply digital effects including ${prompt.digitalEffects.join(", ")}.`
}


To print the final prompt text to the terminal console, I've implemented some utility serialization functions.

Now our prompt library generation code is ready. Let's make use of it in the index.ts file.

index.ts

import inquirer from "inquirer"

import {
  DigitalArtPromptBuilder,
  RealisticPhotoPromptBuilder,
} from "./builders"
import { PromptDirector } from "./director"
import {
  serializeDigitalArtPrompt,
  serializeRealisticPhotoPrompt,
} from "./serializers"

async function main() {
  console.log("=====================")
  console.log("Image Prompt Builder")
  console.log("=====================")

  const director = new PromptDirector()
  const { choice, subject } = await getUserInput()

  if (choice === "Realistic Photo") {
    // Keep the concrete builder type: build() is not part of the PromptBuilder interface
    const builder = new RealisticPhotoPromptBuilder()
    director.setBuilder(builder)
    director.makePrompt(subject)
    const prompt = builder.build()
    console.log("\nGenerated Prompt:")
    console.log(serializeRealisticPhotoPrompt(prompt))
  } else {
    const builder = new DigitalArtPromptBuilder()
    director.setBuilder(builder)
    director.makePrompt(subject)
    const prompt = builder.build()
    console.log("\nGenerated Prompt:")
    console.log(serializeDigitalArtPrompt(prompt))
  }
}

main().catch(console.error)

// get user input function
async function getUserInput() {
  const input = await inquirer.prompt([
    {
      type: "list",
      name: "choice",
      message: "Choose prompt type:",
      choices: ["Realistic Photo", "Digital Art"],
    },
    {
      type: "input",
      name: "subject",
      message: "Enter the subject for your image:",
    },
  ])

  return input
}


The code above performs the following actions:

  1. Prompt the user to select a prompt style and then a subject using the inquirer package: getUserInput.
  2. After getting both the subject and the art style from the user, the client code uses only two components from our library: The PromptBuilder and the Director.
  3. We start by instantiating the Director.
  4. Then, depending on the selected prompt style, we instantiate the corresponding builder and set it to the Director class.
  5. Finally, we call the director.makePrompt method with the chosen subject as an argument, get the prompt from the builder, and print the serialized prompt to the terminal console.

Remember: it's not possible to get the prompt from the director because the shape of the prompt produced by each builder type is different.
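
One possible way around this limitation, which is not what the article's code does, is to make the builder interface generic over its product and let the director return whatever the builder builds. Everything in the sketch below (TypedPromptBuilder, PhotoBuilder, ArtBuilder, TypedDirector, and the simplified product shapes) is a hypothetical variation written for illustration.

```typescript
// Sketch only: a generic variant of the builder contract that also declares build().
interface TypedPromptBuilder<T> {
  setSubject(subject: string): this
  buildBaseProperties(): this
  build(): T
}

// Hypothetical, heavily simplified product shapes.
interface PhotoPrompt {
  kind: "photo"
  subject: string
  location: string
}

interface ArtPrompt {
  kind: "art"
  subject: string
  artStyle: string
}

class PhotoBuilder implements TypedPromptBuilder<PhotoPrompt> {
  private prompt: PhotoPrompt = { kind: "photo", subject: "", location: "" }

  setSubject(subject: string): this {
    this.prompt.subject = subject
    return this
  }

  buildBaseProperties(): this {
    this.prompt.location = "Swiss Alps"
    return this
  }

  build(): PhotoPrompt {
    const result = this.prompt
    this.prompt = { kind: "photo", subject: "", location: "" }
    return result
  }
}

class ArtBuilder implements TypedPromptBuilder<ArtPrompt> {
  private prompt: ArtPrompt = { kind: "art", subject: "", artStyle: "" }

  setSubject(subject: string): this {
    this.prompt.subject = subject
    return this
  }

  buildBaseProperties(): this {
    this.prompt.artStyle = "Cyberpunk"
    return this
  }

  build(): ArtPrompt {
    const result = this.prompt
    this.prompt = { kind: "art", subject: "", artStyle: "" }
    return result
  }
}

// The method is generic per call, so the return type follows the builder passed in.
class TypedDirector {
  makePrompt<T>(builder: TypedPromptBuilder<T>, subject: string): T {
    return builder.setSubject(subject).buildBaseProperties().build()
  }
}

const typedDirector = new TypedDirector()
const photo = typedDirector.makePrompt(new PhotoBuilder(), "a cheese eating a burger")
const art = typedDirector.makePrompt(new ArtBuilder(), "a cheese eating a burger")

console.log(photo.location) // "Swiss Alps"
console.log(art.artStyle) // "Cyberpunk"
```

The trade-off is that build() becomes part of the shared contract, which departs from the classic layout described earlier, where the builder interface only declares the construction steps.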

Conclusion

The Builder design pattern proves to be an excellent solution for creating complex objects with multiple configurations, as demonstrated in our AI image prompt generation CLI application. Here's why the Builder pattern was beneficial in this scenario:

  1. Simplified Object Creation: The pattern allowed us to create intricate RealisticPhotoPrompt and DigitalArtPrompt objects without exposing their complex construction process to the client code.

  2. Flexibility: By using separate builder classes for each prompt type, we could easily add new prompt types or modify existing ones without changing the client code.

  3. Code Organization: The pattern helped separate the construction logic from the representation, making the code more modular and easier to maintain.

  4. Reusability: The PromptDirector class allowed us to reuse the same construction process for different types of prompts, enhancing code reusability.

  5. Abstraction: The client code in index.ts remains simple and focused on high-level logic, while the complexity of prompt construction is abstracted away in the builder classes.

Contact

If you have any questions or want to discuss something further, feel free to contact me here.

Happy coding!

Source: dev.to