Getting Started with PLGL

Build personalized AI applications using preference learning and generative models

Introduction

PLGL (Preference Learning in Generative Latent Spaces) enables you to create AI applications that learn and adapt to individual user preferences. This guide will walk you through everything you need to start building with PLGL.

What You'll Learn

  • How to set up PLGL in your project
  • The core concepts behind PLGL
  • How to build your first preference-learning application
  • Best practices and optimization techniques

Understanding PLGL

Important Note

PLGL is a methodology and approach, not a specific software package. There is no "pip install plgl" command. Instead, PLGL describes how to use preference learning with any generative model that has a latent space.

What You Need

To implement PLGL, you'll need:

  • A generative model with a latent space (e.g., StyleGAN, Stable Diffusion, VAE)
  • A machine learning framework (PyTorch, TensorFlow, JAX, etc.)
  • A way to collect user preferences (UI for ratings)

Reference Implementation

For the original 2018-2019 implementation approach, see:

Requirements

  • Python 3.8+
  • A deep learning framework (PyTorch, TensorFlow, or JAX)
  • A pre-trained generative model with accessible latent space

Quick Start

Here's the conceptual flow of implementing PLGL:

The PLGL Process

  1. Generate Sample Content

    Use your generative model to create diverse samples by sampling different points in the latent space.

  2. Collect User Feedback

    Present samples to users and collect simple feedback (like/dislike, ratings, or implicit signals).

  3. Train Preference Classifier

    Build a model that learns to predict user preferences based on latent space coordinates.

  4. Find Optimal Region

    Use reverse classification to find latent vectors that maximize preference scores.

  5. Generate Personalized Content

    Create new content using the optimal latent vectors.

Example Workflow

If you're using StyleGAN for face generation:

  1. Generate 20 random faces
  2. User swipes left/right (dislike/like)
  3. Train a neural network: latent vector → preference score
  4. Use gradient ascent to find high-scoring latent vectors
  5. Generate faces the user will love
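
The five steps above can be sketched end to end on a toy problem. This is a minimal illustration, not the original implementation: it assumes a simulated user whose hidden taste is a random direction in a 16-dimensional latent space, uses scikit-learn's LogisticRegression as the preference model, and skips image generation entirely. The names `LATENT_DIM`, `true_direction`, and `simulated_user` are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

LATENT_DIM = 16  # assumption: small toy latent space (StyleGAN uses 512)
rng = np.random.default_rng(0)

# Hidden taste of a simulated user (stands in for a real person swiping).
true_direction = rng.normal(size=LATENT_DIM)

def simulated_user(latents):
    """Return 1 (like) or 0 (dislike) for each latent vector."""
    return (latents @ true_direction > 0).astype(int)

# Steps 1-2: sample latents and collect feedback.
samples = rng.normal(size=(200, LATENT_DIM))
labels = simulated_user(samples)

# Step 3: train the mapping latent vector -> preference score.
model = LogisticRegression().fit(samples, labels)

# Step 4: gradient ascent on the predicted log-odds. For a linear model
# the gradient with respect to the latent is simply the weight vector.
latent = rng.normal(size=LATENT_DIM)
start_score = model.predict_proba(latent.reshape(1, -1))[0, 1]
for _ in range(50):
    latent = latent + 0.1 * model.coef_[0]
final_score = model.predict_proba(latent.reshape(1, -1))[0, 1]

# Step 5 would pass `latent` to the generative model.
print(f"score before: {start_score:.3f}, after: {final_score:.3f}")
```

With a real generator, the last step would decode `latent` into an image instead of printing a score.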

Core Concepts

1. Generative Models & Latent Spaces

PLGL works with any generative model that has a continuous latent space. This includes:

  • GANs (StyleGAN, ProGAN, etc.)
  • VAEs (Variational Autoencoders)
  • Diffusion Models with latent representations
  • Any model where inputs map to outputs through a continuous space

2. Preference Learning

PLGL learns preferences through various rating mechanisms:

  • Binary preferences: Like/dislike, thumbs up/down
  • Scaled preferences: 1-5 star ratings
  • Comparative preferences: Choose A or B
  • Implicit signals: Watch time, skip behavior, engagement

The key insight: Users don't need to describe what they want. They just need to react to what they see, and PLGL learns the patterns.
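
Each rating mechanism has to be turned into a training label before a classifier can use it. The helpers below are one possible sketch of that conversion; the function names, the 0.5 watch-time threshold, and the difference-vector treatment of comparisons are illustrative assumptions rather than part of the original implementation.

```python
import numpy as np

def binary_to_label(liked):
    """Like/dislike maps directly to 1.0 / 0.0."""
    return 1.0 if liked else 0.0

def stars_to_label(stars, max_stars=5):
    """Rescale a 1..max_stars rating to the range [0, 1]."""
    return (stars - 1) / (max_stars - 1)

def comparison_to_examples(latent_a, latent_b, chose_a):
    """Turn an 'A or B' choice into two examples on latent differences.

    A linear model trained on (a - b) -> 1 and (b - a) -> 0 learns a
    direction that ranks the chosen item above the rejected one.
    """
    diff = np.asarray(latent_a, dtype=float) - np.asarray(latent_b, dtype=float)
    if chose_a:
        return [(diff, 1.0), (-diff, 0.0)]
    return [(diff, 0.0), (-diff, 1.0)]

def watch_time_to_label(seconds_watched, duration, threshold=0.5):
    """Implicit signal: watching at least `threshold` of the item counts as a like."""
    return 1.0 if seconds_watched / duration >= threshold else 0.0
```

For example, `stars_to_label(3)` gives 0.5, and `comparison_to_examples(a, b, chose_a=True)` yields one positive and one negative example that can be appended to the training set.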

3. Latent Space Navigation

PLGL optimizes through the latent space to find content matching learned preferences:

  • Gradient-based optimization: Use gradients to climb toward high-preference regions
  • Evolutionary optimization: For non-differentiable models
  • Bayesian optimization: Sample-efficient exploration
  • Distribution generation: Create variety while maintaining quality

Reverse Classification: Instead of asking "what score does this content get?", PLGL asks "what latent vector gives a perfect score?"
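
When the preference scorer is not differentiable, the evolutionary option listed above can be sketched as a simple elitist evolution strategy: keep the top scorers, mutate copies of them, and repeat. The function `evolve_latent`, its parameters, and the toy scoring function are hypothetical; a real application would pass its own classifier score as `score_fn`.

```python
import numpy as np

def evolve_latent(score_fn, dim, generations=50, population=32,
                  elite=8, sigma=0.3, seed=0):
    """Search for a high-scoring latent using only score evaluations."""
    rng = np.random.default_rng(seed)
    pop = rng.normal(size=(population, dim))
    for _ in range(generations):
        scores = np.array([score_fn(z) for z in pop])
        # Keep the top `elite` candidates unchanged (elitism).
        parents = pop[np.argsort(scores)[-elite:]]
        # Children are copies of random parents plus Gaussian mutation.
        idx = rng.integers(0, elite, size=population - elite)
        children = parents[idx] + sigma * rng.normal(size=(population - elite, dim))
        pop = np.vstack([parents, children])
    scores = np.array([score_fn(z) for z in pop])
    return pop[np.argmax(scores)], scores.max()

# Toy scorer (assumption): preference peaks at a fixed target vector.
target = np.full(8, 0.5)

def toy_score(z):
    return -np.sum((z - target) ** 2)

best, best_score = evolve_latent(toy_score, dim=8)
print("best score found:", best_score)
```

Because the elite candidates are carried over unchanged, the best score never decreases from one generation to the next.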

Your First PLGL Project

Let's build a complete preference-learning application step by step.

Project: Personalized Art Generator

Step 1: Setup Your Generative Model

First, you'll need to load your chosen generative model. Here's how the original PLGL implementation loads StyleGAN:

# Load pre-trained StyleGAN model
import pickle
import config  # config.py from the StyleGAN repository (defines cache_dir)
import dnnlib
import dnnlib.tflib as tflib

tflib.init_tf()

# Load the model (example using StyleGAN for faces)
url = 'https://drive.google.com/uc?id=1MEGjdvVpUsu1jB4zrXZN7Y4kBBOzizDQ'
with dnnlib.util.open_url(url, cache_dir=config.cache_dir) as f:
    _G, _D, Gs = pickle.load(f)
    # Gs = Long-term average of the generator (best quality)

The key requirement is that your model can generate content from latent vectors (512-dimensional in StyleGAN's case).

Step 2: Build a Rating Interface

Create a simple interface for users to rate generated content. This typically involves:

  • Display function: Show the generated content to the user
  • Rating collection: Capture user feedback (1-5 stars, thumbs up/down, etc.)
  • Data storage: Save latent vectors with their ratings

The interface can be as simple as a command-line prompt or as sophisticated as a mobile app with swipe gestures.
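
A command-line version of such an interface might look like the sketch below. The table layout (`id`, `sample_id`, `rating`) is an assumption chosen so that a `SELECT *` row has the sample id in column 1 and the rating in column 2, matching how Step 3 reads `row[1]` and `row[2]`. The helper names are hypothetical, and the usage lines run against an in-memory database and a temporary directory.

```python
import os
import sqlite3
import tempfile

import numpy as np

def init_db(conn):
    """Create the ratings table if it does not exist (assumed schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS user_ratings ("
        "id INTEGER PRIMARY KEY, sample_id TEXT, rating INTEGER)"
    )
    conn.commit()

def record_rating(conn, results_dir, sample_id, latent, rating):
    """Save the latent vector to disk and its rating to the database."""
    os.makedirs(results_dir, exist_ok=True)
    np.save(os.path.join(results_dir, f"{sample_id}.npy"), latent)
    conn.execute(
        "INSERT INTO user_ratings (sample_id, rating) VALUES (?, ?)",
        (sample_id, int(rating)),
    )
    conn.commit()

def ask_rating(prompt="Like this sample? [y/n] "):
    """Command-line prompt; returns 1 for like, 0 for dislike."""
    return 1 if input(prompt).strip().lower().startswith("y") else 0

# Example usage without user interaction
conn = sqlite3.connect(":memory:")
init_db(conn)
results_dir = tempfile.mkdtemp()
record_rating(conn, results_dir, "sample_0", np.zeros(512), rating=1)
rows = conn.execute("SELECT * FROM user_ratings").fetchall()
print(rows)
```

In an interactive session, the `rating=1` argument would be replaced by the value returned from `ask_rating()` after the sample has been displayed.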

Step 3: Collect User Preferences

Generate diverse samples and collect user ratings. Here's how the real implementation loads preference data:

# Load user preference data from database
import sqlite3
import numpy as np

# Connect to preference database
conn = sqlite3.connect("preferences.db")
c = conn.cursor()
c.execute("SELECT * FROM user_ratings")
rows = c.fetchall()

# Build dataset of latent vectors and ratings
latent_vectors = []
ratings = []
for row in rows:
    # Load the latent vector that generated this image
    latents = np.load(f"results/{row[1]}.npy")
    latent_vectors.append(latents)
    ratings.append(int(row[2]))  # Binary rating: 0 or 1

# Convert to numpy arrays
X = np.vstack(latent_vectors)  # Shape: (n_samples, 512)
y = np.array(ratings)           # Shape: (n_samples,)

Key insight: Each generated image is stored with its latent vector, allowing us to map preferences back to the latent space.

Step 4: Train a Preference Classifier

The original PLGL implementation offers two approaches:

Option A: Neural Network Classifier

# Define simple neural network architecture
NN_ARCHITECTURE = [
    {"input_dim": 512, "output_dim": 1, "activation": "sigmoid"}
]

# Train the preference model
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Custom training function (see full implementation)
params = train(X_train.T, y_train.reshape(-1, 1).T, 
               NN_ARCHITECTURE, epochs=100, learning_rate=0.01)

# Test accuracy
Y_test_hat, _ = full_forward_propagation(X_test.T, params, NN_ARCHITECTURE)
accuracy = get_accuracy_value(Y_test_hat, y_test.reshape(-1, 1).T)
print(f"Test accuracy: {accuracy:.2f}")

Option B: SVM Classifier (Faster)

# Using SVM for fast preference learning
# (reuses X_train, X_test, y_train, y_test from the split in Option A)
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler

# Normalize the data: fit the scaler on the training split only,
# then apply the same transform to the test split
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Train SVM classifier
classifier = SVC(C=10, gamma=0.0001, kernel='rbf')
classifier.fit(X_train_scaled, y_train)

# Evaluate on held-out samples
score = classifier.score(X_test_scaled, y_test)
print(f"SVM accuracy: {score:.2f}")
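
Option A calls `train`, `full_forward_propagation`, and `get_accuracy_value` without showing them. The block below is a minimal sketch of what such helpers could look like for the single sigmoid layer in `NN_ARCHITECTURE`; it is not the original code. It assumes inputs shaped (features, samples), as implied by the `.T` calls in Option A, and stores weights under `"W1"` and `"b1"`, the keys Step 5 reads. The toy data at the end is made up for demonstration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def full_forward_propagation(X, params, nn_architecture=None):
    """X has shape (input_dim, n_samples); returns (Y_hat, cache)."""
    Z = params["W1"] @ X + params["b1"]
    return sigmoid(Z), {"Z1": Z}

def train(X, Y, nn_architecture, epochs=100, learning_rate=0.01, seed=0):
    """Full-batch gradient descent on binary cross-entropy (single layer only)."""
    rng = np.random.default_rng(seed)
    input_dim = nn_architecture[0]["input_dim"]
    params = {
        "W1": rng.normal(scale=0.1, size=(1, input_dim)),
        "b1": np.zeros((1, 1)),
    }
    n_samples = X.shape[1]
    for _ in range(epochs):
        Y_hat, _ = full_forward_propagation(X, params)
        dZ = Y_hat - Y  # gradient of the loss w.r.t. the pre-activation
        params["W1"] -= learning_rate * (dZ @ X.T) / n_samples
        params["b1"] -= learning_rate * dZ.mean(axis=1, keepdims=True)
    return params

def get_accuracy_value(Y_hat, Y):
    """Fraction of samples whose thresholded prediction matches the label."""
    return float(((Y_hat > 0.5) == (Y > 0.5)).mean())

# Toy check (assumption): the label depends only on the first latent coordinate.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(8, 200))
Y_demo = (X_demo[0:1] > 0).astype(float)
arch_demo = [{"input_dim": 8, "output_dim": 1, "activation": "sigmoid"}]
params_demo = train(X_demo, Y_demo, arch_demo, epochs=500, learning_rate=0.5)
acc_demo = get_accuracy_value(
    full_forward_propagation(X_demo, params_demo)[0], Y_demo
)
print(f"Toy training accuracy: {acc_demo:.2f}")
```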

Step 5: Generate Personalized Content

The most innovative part: Reverse Classification™

import math
import numpy as np
import PIL.Image

# Reverse Classification: Find latent vector for desired score
def reverse_classification(target_score, params, nn_architecture):
    """Given a target preference score, find the latent vector"""
    output = np.zeros((512,), dtype="float32")
    
    # Extract trained weights and bias
    W = params["W1"][0]  # Shape: (512,)
    b = params["b1"][0]
    
    # Inverse sigmoid to get pre-activation value
    z = math.log(target_score / (1 - target_score)) - b[0]
    
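    # Distribute across latent dimensions, keeping each coordinate inside (-1, 1)
    for i in range(512):
        if W[i] == 0:
            continue  # a zero weight cannot change the score; skip it
        y = z / W[i]
        if abs(y) >= 1.0:
            # Saturate this coordinate and carry the remainder forward
            output[i] = np.sign(y) * 0.99999
            z -= W[i] * output[i]
        else:
            # The remainder fits in this coordinate; done
            output[i] = y
            break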
    
    return output

# Generate ideal content (99.99% preference score)
ideal_latent = reverse_classification(0.9999, params, NN_ARCHITECTURE)

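# Generate the image. The output transform converts the generator's
# float NCHW output to uint8 NHWC so that PIL can read it.
fmt = dict(func=tflib.convert_images_to_uint8, nchw_to_nhwc=True)
images = Gs.run(ideal_latent.reshape(1, -1), None,
                truncation_psi=0.7, randomize_noise=True,
                output_transform=fmt)
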
# Save personalized result
PIL.Image.fromarray(images[0], 'RGB').save('personalized_result.png')

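This is the key innovation: Instead of searching randomly, we directly compute what latent vector will produce the desired preference score! This is possible because the classifier above is a single sigmoid layer, so the score depends on the latent vector only through one weighted sum, which can be inverted in closed form.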
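
For a single-layer sigmoid classifier, the same inversion can also be written in one line as a minimum-norm solution. This is an alternative to the greedy routine above, not the original method: it spreads the required pre-activation across all dimensions in proportion to the weights and does not clamp coordinates to (-1, 1). The function name and the random demo weights are hypothetical.

```python
import numpy as np

def min_norm_latent(target_score, W, b):
    """Smallest-norm x such that sigmoid(W @ x + b) equals target_score."""
    W = np.asarray(W, dtype=float)
    # Required value of W @ x, from inverting the sigmoid.
    z = np.log(target_score / (1.0 - target_score)) - b
    return W * z / np.dot(W, W)

# Demo with made-up weights (assumption: 512-dimensional latent space).
rng = np.random.default_rng(0)
W_demo = rng.normal(size=512)
b_demo = 0.1
x_demo = min_norm_latent(0.9999, W_demo, b_demo)

# Sanity check: run the latent back through the classifier.
score_demo = 1.0 / (1.0 + np.exp(-(W_demo @ x_demo + b_demo)))
print(f"recovered score: {score_demo:.6f}")
```

Feeding the result back through the classifier, as in the last lines, is a cheap way to verify any reverse-classification routine.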

Step 6: Iterative Refinement (Optional)

Continuously improve the preference model with batch generation:

# GPU-friendly batch generation
def generate_batch(params, nn_architecture, generator, n_samples=100,
                   exploit_fraction=0.7):
    """Generate a batch mixing exploitation and exploration (70/30 by default)"""
    n_exploit = int(round(n_samples * exploit_fraction))
    n_explore = n_samples - n_exploit

    # Exploitation: refine around a high-scoring point by adding small noise.
    # reverse_classification is deterministic, so compute it once.
    good_latent = reverse_classification(0.95, params, nn_architecture)
    exploit = good_latent + np.random.normal(0, 0.1, size=(n_exploit, 512))

    # Exploration: random latents to discover new preferences
    explore = np.random.randn(n_explore, 512)

    # Generate all images in one GPU batch
    batch_latents = np.vstack([exploit, explore])
    images = generator.run(batch_latents, None, truncation_psi=0.7)

    return images, batch_latents

Implementation Notes from Original Code:

  • Database: SQLite for storing latent vectors with ratings
  • Latent dimension: 512 for StyleGAN faces
  • Binary classification: Simple hot/not ratings (0 or 1)
  • Fast training: 100-1000 epochs typically sufficient
  • Caching: Pre-generate and cache common samples

Complete Working Example

Here's how all the pieces come together in a real PLGL implementation:

# Complete PLGL Pipeline Example
import numpy as np
import sqlite3
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# 1. LOAD YOUR GENERATIVE MODEL
# (Using StyleGAN as example - adapt for your model)
import dnnlib.tflib as tflib
tflib.init_tf()
# ... load your model here ...

# 2. COLLECT PREFERENCES FROM DATABASE
conn = sqlite3.connect("preferences.db")
c = conn.cursor()
c.execute("SELECT latent_vector_id, rating FROM user_preferences")
preferences = c.fetchall()

# Load latent vectors and ratings
X = []  # Latent vectors
y = []  # Ratings
for latent_id, rating in preferences:
    latent = np.load(f"latents/{latent_id}.npy")
    X.append(latent)
    y.append(rating)

X = np.vstack(X)  # Shape: (n_samples, 512), also when each file holds a (1, 512) array
y = np.array(y)

# 3. TRAIN PREFERENCE CLASSIFIER (SVM for speed)
from sklearn.svm import SVC

# Split and scale data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Train fast SVM classifier
classifier = SVC(kernel='rbf', C=10, gamma=0.0001)
classifier.fit(X_train_scaled, y_train)
print(f"Accuracy: {classifier.score(X_test_scaled, y_test):.2f}")

# 4. GENERATE PERSONALIZED CONTENT
def find_optimal_latent(classifier, scaler, n_iterations=1000):
    """Find a high-scoring latent vector via random-restart hill climbing"""
    def score_of(latent):
        return classifier.decision_function(
            scaler.transform(latent.reshape(1, -1))
        )[0]

    best_latent = None
    best_score = -np.inf  # decision scores can be below -1

    for _ in range(n_iterations):
        # Random initialization
        latent = np.random.randn(512)
        current_score = score_of(latent)

        # Simple hill climbing: keep a perturbation only if it
        # improves on the current point of this restart
        for step in range(100):
            new_latent = latent + np.random.randn(512) * 0.01
            new_score = score_of(new_latent)
            if new_score > current_score:
                latent, current_score = new_latent, new_score

        # Remember the best result across all restarts
        if current_score > best_score:
            best_latent, best_score = latent, current_score

    return best_latent

# Find and generate optimal content
optimal_latent = find_optimal_latent(classifier, scaler)
# optimal_image = generator.generate(optimal_latent)

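# SVC provides predict_proba only when constructed with probability=True,
# so report the signed distance to the decision boundary instead
print("Found optimal latent vector with decision score:",
      classifier.decision_function(
          scaler.transform(optimal_latent.reshape(1, -1))
      )[0])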

Important Implementation Details

  • Balance your dataset: Include negative examples to prevent blind spots
  • Use caching: Pre-generate common samples for instant responses
  • Batch processing: Generate 100+ samples at once for GPU efficiency
  • Simple feedback: Binary ratings work better than complex scales

Advanced Usage

Multi-Modal Preferences

PLGL can work across multiple modalities simultaneously:

  • Visual + Text: Learn preferences for images with captions
  • Audio + Visual: Match music preferences with visual styles
  • Cross-modal transfer: Apply preferences from one domain to another

The key is to have a shared latent space or a way to map between different latent spaces.

Conditional Generation

Generate content based on both preferences and conditions:

  • Context-aware: Different preferences for different moods, times, or situations
  • Multi-user: Switch between different user preference models
  • Hybrid approach: Combine prompts with preference learning

This allows for more nuanced personalization that adapts to changing contexts.

Real-Time Adaptation

PLGL can adapt in real-time using efficient classifiers like SVMs:

  • Fast updates: SVM-style classifiers can be retrained quickly
  • Online learning: Update preferences with each interaction
  • Implicit feedback: Learn from engagement time, skips, or other signals
  • Cached content: Use pre-generated samples for instant response

The lightweight nature of the preference classifier enables true real-time personalization.
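
Online learning can be sketched with scikit-learn's `SGDClassifier`, whose `partial_fit` method updates a linear model one interaction at a time. With hinge loss this is a linear SVM trained by stochastic gradient descent. The simulated user (`taste`) and the 32-dimensional latent space are assumptions made for the demonstration.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
dim = 32  # assumption: toy latent dimensionality
taste = rng.normal(size=dim)  # hidden preference of a simulated user

# Hinge loss gives a linear SVM trained incrementally.
online_model = SGDClassifier(loss="hinge", random_state=0)
classes = np.array([0, 1])

for _ in range(300):
    latent = rng.normal(size=(1, dim))
    rating = np.array([int(latent[0] @ taste > 0)])
    # `classes` is required on the first call and may be passed every time.
    online_model.partial_fit(latent, rating, classes=classes)

# Evaluate on fresh samples from the same simulated user.
test_latents = rng.normal(size=(500, dim))
test_labels = (test_latents @ taste > 0).astype(int)
online_accuracy = online_model.score(test_latents, test_labels)
print(f"accuracy after 300 interactions: {online_accuracy:.2f}")
```

Each `partial_fit` call touches only the newest sample, so the cost per interaction stays constant as the history grows.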

Distributed Learning

PLGL can be implemented in privacy-preserving ways:

  • Local training: Each user's preference model stays on their device
  • Federated learning: Share model updates, not personal data
  • Differential privacy: Add noise to protect individual preferences
  • Zero-knowledge: Match preferences without revealing them

This enables personalization at scale while respecting user privacy.
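
The federated idea can be illustrated with a few lines of NumPy: each simulated device trains a logistic-regression model on its own ratings and shares only the weight change, which the server averages. Everything here is a hypothetical sketch. In particular, the Gaussian noise only illustrates where privacy noise would be added; on its own it gives no formal differential-privacy guarantee, which would also need update clipping and privacy accounting.

```python
import numpy as np

def local_update(weights, latents, ratings, lr=0.1, steps=20):
    """A few steps of logistic-regression gradient descent on one device."""
    w = weights.copy()
    for _ in range(steps):
        preds = 1.0 / (1.0 + np.exp(-(latents @ w)))
        w -= lr * latents.T @ (preds - ratings) / len(ratings)
    return w

def federated_round(global_w, clients, noise_std=0.0, rng=None):
    """Average the (optionally noised) weight changes from all clients."""
    if rng is None:
        rng = np.random.default_rng()
    deltas = []
    for latents, ratings in clients:
        delta = local_update(global_w, latents, ratings) - global_w
        delta = delta + noise_std * rng.normal(size=delta.shape)
        deltas.append(delta)  # only the update leaves the device
    return global_w + np.mean(deltas, axis=0)

# Demo: three simulated devices that share one underlying taste.
rng = np.random.default_rng(0)
taste = rng.normal(size=8)
clients = []
for _ in range(3):
    Z = rng.normal(size=(50, 8))
    clients.append((Z, (Z @ taste > 0).astype(float)))

w = np.zeros(8)
for _ in range(10):
    w = federated_round(w, clients, noise_std=0.01, rng=rng)

all_Z = np.vstack([c[0] for c in clients])
all_y = np.concatenate([c[1] for c in clients])
fed_accuracy = float(np.mean((all_Z @ w > 0) == (all_y > 0.5)))
print(f"accuracy of the averaged model: {fed_accuracy:.2f}")
```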

Example Projects

🎵 Music Preference Learning

Generate personalized music using MusicVAE

View Code →

🏗️ Architecture Design

Create buildings matching style preferences

View Code →

🧬 Molecule Discovery

Design molecules with desired properties

View Code →

📚 Story Generation

Create narratives matching reading preferences

View Code →

👗 Fashion Design

Generate clothing matching personal style

View Code →

🎮 Game Level Generation

Create levels matching player preferences

View Code →

Troubleshooting

Common Issues

Preference model not converging

Solution: Try these approaches:

  • Collect more diverse preferences: Use maximum diversity sampling to cover the latent space better
  • Use a simpler model: Start with a linear SVM classifier before trying deep networks
  • Adjust learning rate: Try smaller learning rates (0.0001 or lower)
  • Check data quality: Look for contradictory ratings or insufficient variety
  • Balance your dataset: Include pre-marked negative examples (e.g., inappropriate content)
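
Two of the checks above, class balance and a simple linear baseline, can be scripted in a few lines. The sketch below uses scikit-learn's `LinearSVC` with cross-validation; the `diagnose` helper and the toy data are hypothetical.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

def diagnose(X, y, folds=5):
    """Report class balance and the cross-validated accuracy of a linear SVM."""
    y = np.asarray(y).astype(int)
    balance = np.bincount(y, minlength=2) / len(y)
    scores = cross_val_score(LinearSVC(C=1.0, max_iter=5000), X, y, cv=folds)
    return {"class_balance": balance, "cv_accuracy": float(scores.mean())}

# Toy data (assumption): the label depends on the first latent coordinate.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 16))
y_demo = (X_demo[:, 0] > 0).astype(int)
report = diagnose(X_demo, y_demo)
print(report)
```

If the linear baseline is already near chance on real ratings, the problem is more likely in the data (too few, too one-sided, or contradictory ratings) than in the model architecture.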

Generated content lacks diversity

Solution: Increase exploration:

  • Temperature sampling: Add noise to latent vectors for more variety
  • Multiple starting points: Optimize from different random initializations
  • Batch generation: Use 70% exploitation (refining known good areas) and 30% exploration
  • Minimum distance constraints: Ensure generated samples aren't too similar
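
A minimum distance constraint can be enforced with a greedy filter over candidate latents, as in this sketch. The function name, the Euclidean metric, and the threshold are illustrative assumptions.

```python
import numpy as np

def filter_by_min_distance(latents, min_dist):
    """Greedily keep latents at least `min_dist` away from every kept one."""
    kept = []
    for z in latents:
        if all(np.linalg.norm(z - k) >= min_dist for k in kept):
            kept.append(z)
    return np.array(kept)

# Demo: five distinct latents plus five near-duplicates of them.
rng = np.random.default_rng(0)
base = rng.normal(size=(5, 512))
near_duplicates = base + 0.01 * rng.normal(size=base.shape)
candidates = np.vstack([base, near_duplicates])

diverse = filter_by_min_distance(candidates, min_dist=1.0)
print(len(candidates), "candidates ->", len(diverse), "kept")
```

The filter is quadratic in the number of candidates, which is fine for batches of a few hundred samples.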

Memory or performance issues

Solution: Use efficient techniques:

  • GPU batching: Generate content in batches for efficient GPU utilization
  • Cached content: Reuse pre-generated samples, especially for initial training
  • SVM classifiers: Use fast, lightweight classifiers that can handle large datasets
  • Mixed caching strategy: Combine fresh generations with cached positive/negative examples
  • Gradient checkpointing: Trade computation for memory when using large models
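
Batching and caching can be combined in a small wrapper around whatever function produces content from a latent vector. The sketch below is hypothetical: `iter_batches` and `SampleCache` are illustrative names, and the lambda stands in for a real generator call.

```python
import numpy as np

def iter_batches(latents, batch_size):
    """Yield consecutive slices of at most `batch_size` latents."""
    for start in range(0, len(latents), batch_size):
        yield latents[start:start + batch_size]

class SampleCache:
    """Store generated outputs keyed by the bytes of the latent vector."""

    def __init__(self, generate_fn):
        self.generate_fn = generate_fn
        self.store = {}
        self.misses = 0

    def get(self, latent):
        key = np.asarray(latent, dtype=np.float32).tobytes()
        if key not in self.store:
            self.misses += 1  # generate only on the first request
            self.store[key] = self.generate_fn(latent)
        return self.store[key]

# Demo with a stand-in generator (assumption: real code would render an image).
latents = np.zeros((10, 512), dtype=np.float32)
batch_sizes = [len(batch) for batch in iter_batches(latents, 4)]

cache = SampleCache(lambda z: float(np.sum(z)))
first = cache.get(latents[0])
second = cache.get(latents[0])  # served from the cache
print(batch_sizes, cache.misses)
```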