Llama 2 (our choice)

Llama 2: Overview¶

Llama 2 is an open-source language model developed by Meta, designed for advanced text processing tasks. It is available in various sizes (e.g., 7B, 13B, 70B parameters) and is suitable for both online and offline applications.

1. Technical Specifications¶

Model sizes:
- Available in 7B, 13B, and 70B (B = billion parameters).
- Llama 2 (7B): Ideal for devices with limited memory (approx. 8–12 GB RAM).
- Llama 2 (13B): Delivers higher text quality but requires more hardware (~16 GB RAM).
Optimization:
- Supports quantization (e.g., 4-bit), significantly reducing memory requirements.
Efficiency:
- Designed for use on standard hardware, without requiring high-end GPUs.

2. Advantages of Llama 2¶

Offline capability:
- Can be executed fully locally without needing an internet connection.
Open-source license:
- No ongoing costs for usage or customization.
Flexibility:
- Can be trained or fine-tuned using custom data (e.g., game lore, character info).
Hardware friendly:
- Runs efficiently on moderately powerful systems, especially the 7B model.
Customizable:
- Designed for developers who want full control over model behavior and integration.

3. Disadvantages of Llama 2¶

Quality limitations:
- Text generation quality is good but does not match cutting-edge cloud models like GPT-4.
Technical effort:
- Setup and training require technical knowledge and experience.
Hardware demands:
- Larger models (13B and 70B) require significantly more RAM and compute resources.
Integration complexity:
- Requires additional tooling (e.g., via Python or plugin) to be used within a game engine like Godot.

4. Use in Our Game¶

Why Llama 2 is suitable for our point-and-click adventure:

Offline dialogue systems:
- Dynamic AI dialogues can run entirely locally, ideal for a game that should work without internet.
Efficient and lightweight:
- The 7B model can run on standard PCs, allowing players to enjoy the game without needing high-end hardware.
Custom lore integration:
- We can train Llama 2 with our game world, characters, and specific dialogue styles to generate immersive and personalized responses.
Cosmic horror atmosphere:
- Llama 2 can be fine-tuned to enhance the dark and unsettling tone of the story through stylized and eerie dialogue.

5. Integrating Llama 2¶

Tools like Llama.cpp:
- Llama.cpp is a framework that enables efficient local execution of Llama 2.
- It supports quantization, allowing the model to run on lower-end hardware.
Plugins or scripts:
- For Godot integration, a Python interface can be used to connect the model with the in-game dialogue system.

6. Comparison to Other Models¶

Compared to GPT-J and GPT-NeoX:
- Llama 2 (7B) is more efficient and hardware-friendly, though slightly less capable in complex text generation.
- Larger Llama 2 models (13B, 70B) can offer quality on par with or better than GPT-NeoX, while requiring fewer resources.
Compared to GPT-4 (API):
- Llama 2 is free and offline-capable, but its generative quality is not on the same level as GPT-4.

Conclusion¶

Llama 2 is an excellent choice for our game if we want to:

Implement offline dialogue systems.
Keep hardware requirements low.
Use open-source solutions that incur no ongoing costs.

It is especially useful if we’re willing to invest some time into integration and fine-tuning to tailor the model to our game world.