PLGL: Preference Learning in Generative Latent Spaces

Transform user preferences into personalized AI-generated content across any domain

2018

Pioneered

2019

Public Demo

2025

Open Sourced

∞

Possibilities

Tinder doesn't ask you to describe your perfect match

PLGL is a revolutionary approach to personalized content discovery and generation

🎧

Listen to How PLGL Works and Its Possibilities

Prefer to listen? Learn about PLGL technology in this quick audio overview.

1

Rate Samples

Users rate AI-generated content based on their preferences

๐Ÿ‘/๐Ÿ‘Ž
โ†’
2

Learn Preferences

Build personalized ML models from rating patterns

๐Ÿง 
โ†’
3

Navigate Latent Space

Optimize through generative model's latent dimensions

๐ŸŽฏ
โ†’
4

Generate Ideal Content

Create perfectly personalized results

โœจ
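
For concreteness, here is a minimal sketch of that four-step loop. The generator and rating function below are toy stand-ins (a real deployment would plug in a pretrained model such as StyleGAN and live user swipes):

import numpy as np
from sklearn.svm import SVC

latent_dim = 512

def generate(z):
    # Stand-in for a real generator (e.g., StyleGAN); returns the latent itself
    return z

def rate(content):
    # Stand-in for a human rating: 1 = like, 0 = dislike
    return int(np.linalg.norm(content) < np.sqrt(latent_dim))

# 1. Rate samples
zs = np.random.randn(200, latent_dim)
ratings = [rate(generate(z)) for z in zs]

# 2. Learn preferences
clf = SVC(kernel='rbf', probability=True).fit(zs, ratings)

# 3. Navigate latent space: score candidates, keep the best
candidates = np.random.randn(10_000, latent_dim)
best_z = candidates[np.argmax(clf.predict_proba(candidates)[:, 1])]

# 4. Generate ideal content
ideal = generate(best_z)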

Infinite Applications

PLGL works with any generative model that has a latent space

🎵

Music Generation

Create infinite personalized playlists and compositions

  • Spotify-style "Made for You" generation
  • Game soundtrack adaptation
  • Mood-based composition
See Example →
🎨

Art & Design

Generate artwork matching personal aesthetic preferences

  • Logo design automation
  • NFT collection generation
  • Interior design concepts
See Example →
🧬

Drug Discovery

Design molecules with desired properties

  • Optimize for bioavailability
  • Minimize side effects
  • Target specific proteins
See Example →
🏗️

Architecture

Design buildings matching lifestyle preferences

  • Residential layouts
  • Commercial facades
  • Urban planning
See Example →
📚

Story Generation

Create narratives matching reading preferences

  • Personalized novels
  • Interactive fiction
  • Educational content
See Example →
👗

Fashion Design

Generate clothing matching personal style

  • Custom garment design
  • Outfit recommendations
  • Trend prediction
See Example →
🔬

Material Science

Design materials with optimal properties

  • Strength optimization
  • Conductivity tuning
  • Sustainability focus
See Example →
🎮

Game Design

Create levels matching player preferences

  • Difficulty adaptation
  • Style preferences
  • Reward optimization
See Example →
💕

Private Dating

Match without sharing photos: the AI understands your type

"Perfection is just a few swipes away"

  • Train on generated faces
  • Match via latent preferences
  • Privacy-first dating
See Example →
📱

Zero-Prompt Social Media

Like TikTok for AI content: just swipe, no prompting

  • Personalized video streams
  • Custom music feeds
  • Infinite fresh content
See Example →
📰

News & Content Curation

Transform headlines and stories to match reader preferences

  • Personalized headlines
  • Adaptive storytelling
  • Interest-based ranking
See Example →
💄

Beauty & Makeup

See your ideal look and how to achieve it

  • Personalized transformations
  • Product recommendations
  • Style optimization
See Example →
🚗

Automotive Design

Design your perfect car through preferences

  • Custom car generation
  • Style preferences
  • Feature optimization
See Example →
🧬

DNA & Genetics

Design genetic sequences for desired traits

  • Trait optimization
  • Gene expression control
  • Synthetic biology
See Example →
➕

Your Application

PLGL works with any generative model (see the interface sketch after this list)

  • 3D model generation
  • Recipe creation
  • Any latent space!
Build Your Own →
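
Loosely, the only contract PLGL asks of a new domain is a decoder from a latent vector to content. A minimal sketch of that interface (the names here are illustrative, not a published PLGL API):

from typing import Protocol
import numpy as np

class LatentGenerator(Protocol):
    """Anything that maps a latent vector to content can plug into PLGL."""
    latent_dim: int

    def generate(self, z: np.ndarray) -> object:
        """Decode a (latent_dim,) vector into domain content
        (an image, a melody, a molecule, a floor plan, ...)."""
        ...

def random_latents(generator: LatentGenerator, n: int) -> np.ndarray:
    """Draw candidate latent codes to show to the user for rating."""
    return np.random.randn(n, generator.latent_dim)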

Why PLGL Matters

Revolutionary technology now open-sourced for the community

2018

Pioneering Innovation

Developed when generative models were nascent. We solved preference learning before it was recognized as a problem.

1000x

Faster Than Search

Reverse Classification™ computes optimal latent vectors directly. No brute-force searching through millions of possibilities.

10ms

Real-Time Updates

SVM-based preference learning updates almost instantly: retraining a neural network can take minutes, while an SVM update takes milliseconds.
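
As a rough illustration (not a benchmark; timings depend on hardware and data size), the snippet below times a full SVC refit on a few hundred 512-dimensional latent vectors, the scale of a typical rating session:

import time
import numpy as np
from sklearn.svm import SVC

# A few hundred rated 512-dimensional latent vectors
X = np.random.randn(300, 512)
y = (np.linalg.norm(X, axis=1) < np.sqrt(512)).astype(int)

start = time.perf_counter()
clf = SVC(kernel='rbf').fit(X, y)  # full refit from scratch
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"SVM refit on {len(X)} ratings: {elapsed_ms:.1f} ms")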

Open

Community Driven

Now open-sourced under MIT license. Build on our foundation and create amazing experiences.

Privacy

On-Device Learning

Runs entirely on-device. Your preferences never leave your machine. True privacy by design.

∞

Universal Application

Works with ANY generative model with a latent space. One technology, infinite possibilities.

Join the Revolution

PLGL is now open source. Help us transform how humans interact with AI.

Get Started · View on GitHub · Read Whitepaper

The Origin Story

How SkinDeep.ai Inc pioneered PLGL technology in 2018-2019

First Implementation of Preference Learning in Latent Spaces

"Perfection is just a few swipes away"

In 2018-2019, SkinDeep.ai Inc developed the groundbreaking PLGL (Preference Learning in Generative Latent Spaces) technology. Our original app demonstrated how user ratings could train personalized classifiers to navigate StyleGAN's latent space, creating ideal personalized content without any prompting.

🧠

Learned Individual Preferences

Built unique preference models for each user based on simple ratings

🎯

Reverse Classification

Used trained classifiers to find optimal points in StyleGAN's latent space

📊

Distribution Generation

Created diverse samples for iterative preference refinement

Patent Filed: SkinDeep.ai Inc filed a provisional patent in 2019 for the PLGL methodology. We've now open-sourced this technology for community benefit and widespread adoption.

Original Demo Videos & Patent (2019)

SkinDeep.ai Video Demo

See the original app in action and learn about the core concepts

▶ Watch Demo

Technical Deep Dive

Detailed explanation of the preference learning algorithm

▶ Watch Video

Provisional Patent

Read the original patent filing describing the technology and applications

📄 View Patent

Original Source Code

Archive of the original SkinDeep.ai repositories (2018-2019)

Interactive Demos

Experience PLGL in action

🎮 Try It Yourself - Interactive Preference Learning

Click or tap to show what you like (👍) and dislike (👎). Watch how the AI learns your preferences and finds your ideal spot!

How this demo works: This is a simplified 2D visualization of PLGL's latent space exploration. In real applications, PLGL works with high-dimensional spaces (e.g., 512 dimensions for image generation). When the optimal region is found, PLGL balances:
• Exploitation: Generating samples near the optimal region to refine preferences
• Exploration: Testing uncertain areas to find other potential maxima and improve model stability
This ensures diverse outputs while continuously learning your true preferences.
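
One way to realize that balance is to mix each new rating batch between high-scoring and uncertain latents. The sketch below assumes a trained classifier clf exposing predict_proba, as in the implementations further down (the function and parameter names are illustrative):

import numpy as np

def next_rating_batch(clf, latent_dim, batch_size=10, explore_frac=0.3, pool=5000):
    """Mix exploitation (high-scoring latents) with exploration (uncertain ones)."""
    pool_z = np.random.randn(pool, latent_dim)
    p = clf.predict_proba(pool_z)[:, 1]        # predicted preference
    n_explore = int(batch_size * explore_frac)
    n_exploit = batch_size - n_explore

    exploit_idx = np.argsort(-p)[:n_exploit]   # near the current optimum
    uncertainty = 1 - np.abs(p - 0.5) * 2      # peaks where p is close to 0.5
    explore_idx = np.argsort(-uncertainty)[:n_explore]

    return pool_z[np.concatenate([exploit_idx, explore_idx])]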

Likes: 0 Dislikes: 0 Confidence: 0%

📊 Automated Learning Demo

Watch how PLGL automatically explores and learns preferences

Batch: 0 Samples: 0 Accuracy: 0%

🎥 Original StyleGAN Demo

See the original 2019 SkinDeep.ai app that pioneered PLGL technology

📹

Watch how the original app used StyleGAN to find users' ideal faces through simple swipe interactions

▶️ Watch Demo Video
📚 Technical Deep Dive →

Implementation Examples

PyTorch Implementation

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

class PLGLPyTorch:
    """
    PyTorch-idiomatic PLGL implementation
    
    While the original PLGL used SVM classifiers, this implementation shows how to
    achieve similar results using PyTorch-native components. The key insight is that
    a neural network with appropriate regularization can approximate SVM behavior.
    """
    
    def __init__(self, generator, latent_dim, device='cuda'):
        self.generator = generator.to(device)
        self.latent_dim = latent_dim
        self.device = device
        self.preference_model = None
        self.preference_data = {'latents': [], 'ratings': []}
        
    def collect_preferences(self, n_samples=100, user_rating_fn=None):
        """
        Collect user preferences on generated samples
        
        Args:
            n_samples: Number of samples to generate
            user_rating_fn: Function that displays content and returns rating (0 or 1)
                           If None, uses simulated ratings for demonstration
        """
        samples = []
        
        with torch.no_grad():
            for i in range(n_samples):
                # Sample from latent space
                z = torch.randn(1, self.latent_dim).to(self.device)
                
                # Generate content
                content = self.generator(z)
                
                # Get user rating
                if user_rating_fn:
                    rating = user_rating_fn(content)
                else:
                    # Simulated rating based on distance from origin
                    # (In real use, implement actual UI)
                    rating = 1 if torch.norm(z) < 1.5 else 0
                
                self.preference_data['latents'].append(z.cpu())
                self.preference_data['ratings'].append(rating)
                
                samples.append({
                    'latent': z,
                    'content': content,
                    'rating': rating
                })
                
        return samples
    
    def build_preference_model(self):
        """
        Build a neural network that mimics SVM behavior
        Uses strong L2 regularization and limited capacity to encourage
        smooth decision boundaries like an RBF kernel SVM
        """
        self.preference_model = nn.Sequential(
            nn.Linear(self.latent_dim, 64),
            nn.Tanh(),  # Bounded activation like SVM
            nn.Linear(64, 32),
            nn.Tanh(),
            nn.Linear(32, 1),
            nn.Sigmoid()
        ).to(self.device)
        
        return self.preference_model
    
    def train_preference_model(self, epochs=100, batch_size=32, lr=0.01, weight_decay=0.01):
        """
        Train the preference model with PyTorch optimization
        
        The weight_decay parameter provides L2 regularization similar to
        the C parameter in SVMs (higher weight_decay = lower C)
        """
        if not self.preference_data['latents']:
            raise ValueError("No preference data collected yet")
            
        # Prepare data
        X = torch.cat(self.preference_data['latents']).to(self.device)
        y = torch.tensor(self.preference_data['ratings']).float().to(self.device)
        
        # Create data loader
        dataset = TensorDataset(X.squeeze(1), y)
        loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
        
        # Initialize model and optimizer
        if self.preference_model is None:
            self.build_preference_model()
            
        optimizer = optim.Adam(
            self.preference_model.parameters(), 
            lr=lr, 
            weight_decay=weight_decay  # L2 regularization
        )
        criterion = nn.BCELoss()
        
        # Training loop
        self.preference_model.train()
        for epoch in range(epochs):
            total_loss = 0
            for batch_x, batch_y in loader:
                optimizer.zero_grad()
                
                pred = self.preference_model(batch_x).squeeze()
                loss = criterion(pred, batch_y)
                
                loss.backward()
                optimizer.step()
                
                total_loss += loss.item()
                
            if (epoch + 1) % 20 == 0:
                print(f"Epoch {epoch+1}/{epochs}, Loss: {total_loss/len(loader):.4f}")
    
    def find_optimal_latent(self, n_starts=10, n_steps=1000, lr=0.1):
        """
        Find optimal point in latent space using gradient ascent
        Multiple restarts ensure we find global optimum
        """
        if self.preference_model is None:
            raise ValueError("Train preference model first")
            
        best_z = None
        best_score = -float('inf')
        
        self.preference_model.eval()
        
        for start in range(n_starts):
            # Random initialization
            z = torch.randn(1, self.latent_dim, requires_grad=True, device=self.device)
            optimizer = optim.Adam([z], lr=lr)
            
            for step in range(n_steps):
                optimizer.zero_grad()
                
                # Forward pass through preference model
                score = self.preference_model(z)
                
                # Maximize score (minimize negative score)
                loss = -score
                loss.backward()
                optimizer.step()
                
                # Constrain to reasonable latent space region
                with torch.no_grad():
                    z.clamp_(-3, 3)
            
            # Check if this is the best result
            final_score = self.preference_model(z).item()
            if final_score > best_score:
                best_score = final_score
                best_z = z.detach().clone()
                
        # Generate content from optimal latent
        with torch.no_grad():
            optimal_content = self.generator(best_z)
            
        return optimal_content, best_z, best_score
    
    def generate_distribution(self, n_samples=100, threshold=0.7, temperature=1.0):
        """
        Generate samples from high-preference regions
        
        Args:
            n_samples: Number of samples to generate
            threshold: Minimum preference score to accept
            temperature: Controls diversity (higher = more diverse)
        """
        if self.preference_model is None:
            raise ValueError("Train preference model first")
            
        self.preference_model.eval()
        accepted_samples = []
        accepted_latents = []
        
        with torch.no_grad():
            attempts = 0
            while len(accepted_samples) < n_samples and attempts < n_samples * 100:
                # Sample with temperature
                z = torch.randn(1, self.latent_dim).to(self.device) * temperature
                
                # Evaluate preference
                score = self.preference_model(z).item()
                
                if score > threshold:
                    content = self.generator(z)
                    accepted_samples.append(content)
                    accepted_latents.append(z)
                    
                attempts += 1
                
        return accepted_samples, accepted_latents
    
    def active_learning_step(self, n_samples=10):
        """
        Select most informative samples for next round of rating
        Focuses on regions where the model is uncertain
        """
        if self.preference_model is None:
            raise ValueError("Train preference model first")
            
        self.preference_model.eval()
        candidates = []
        uncertainties = []
        
        with torch.no_grad():
            # Generate candidate pool
            for _ in range(n_samples * 10):
                z = torch.randn(1, self.latent_dim).to(self.device)
                score = self.preference_model(z).item()
                
                # Uncertainty is highest near 0.5
                uncertainty = 1 - abs(score - 0.5) * 2
                
                candidates.append(z)
                uncertainties.append(uncertainty)
        
        # Select most uncertain samples
        uncertainties = torch.tensor(uncertainties)
        top_indices = torch.topk(uncertainties, n_samples).indices
        
        selected_samples = []
        for idx in top_indices:
            z = candidates[idx]
            content = self.generator(z)
            selected_samples.append({
                'latent': z,
                'content': content,
                'uncertainty': uncertainties[idx].item()
            })
            
        return selected_samples

# Example usage
def example_usage():
    """
    Demonstrates how to use the PyTorch PLGL implementation
    """
    # Assume we have a pre-trained generator (e.g., StyleGAN, VAE, etc.)
    # generator = load_pretrained_generator()
    # latent_dim = 512
    
    # For demonstration, we'll use a simple generator
    class DemoGenerator(nn.Module):
        def __init__(self, latent_dim, output_dim):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(latent_dim, 256),
                nn.ReLU(),
                nn.Linear(256, output_dim),
                nn.Tanh()
            )
            
        def forward(self, z):
            return self.net(z)
    
    # Initialize
    latent_dim = 64
    output_dim = 256
    generator = DemoGenerator(latent_dim, output_dim)
    
    plgl = PLGLPyTorch(generator, latent_dim, device='cuda' if torch.cuda.is_available() else 'cpu')
    
    # Step 1: Collect initial preferences
    print("Collecting initial preferences...")
    samples = plgl.collect_preferences(n_samples=100)
    
    # Step 2: Train preference model
    print("Training preference model...")
    plgl.train_preference_model(epochs=100)
    
    # Step 3: Find optimal content
    print("Finding optimal content...")
    optimal_content, optimal_z, score = plgl.find_optimal_latent()
    print(f"Optimal score: {score:.3f}")
    
    # Step 4: Generate distribution of good samples
    print("Generating high-preference samples...")
    good_samples, _ = plgl.generate_distribution(n_samples=50)
    
    # Step 5: Active learning for refinement
    print("Selecting samples for active learning...")
    uncertain_samples = plgl.active_learning_step(n_samples=10)
    
    return plgl, optimal_content, good_samples

if __name__ == "__main__":
    plgl, optimal, samples = example_usage()
TensorFlow Implementation

import tensorflow as tf
import numpy as np

class PLGLTensorFlow:
    """
    TensorFlow 2.x-idiomatic PLGL implementation
    
    This implementation uses TensorFlow's native capabilities while maintaining
    the core PLGL philosophy. We use Keras models with regularization to
    approximate the smooth decision boundaries of the original SVM approach.
    """
    
    def __init__(self, generator, latent_dim):
        self.generator = generator
        self.latent_dim = latent_dim
        self.preference_model = None
        self.preference_data = {'latents': [], 'ratings': []}
        
    def collect_preferences(self, n_samples=100, user_rating_fn=None):
        """
        Collect user preferences using TensorFlow operations
        
        Args:
            n_samples: Number of samples to generate
            user_rating_fn: Function to get user ratings, None for simulation
        """
        samples = []
        
        for i in range(n_samples):
            # Sample from latent space
            z = tf.random.normal([1, self.latent_dim])
            
            # Generate content
            content = self.generator(z, training=False)
            
            # Get user rating
            if user_rating_fn:
                rating = user_rating_fn(content.numpy())
            else:
                # Simulated rating for demonstration
                rating = 1 if tf.norm(z) < 1.5 else 0
            
            self.preference_data['latents'].append(z.numpy())
            self.preference_data['ratings'].append(rating)
            
            samples.append({
                'latent': z,
                'content': content,
                'rating': rating
            })
            
        return samples
    
    def build_preference_model(self):
        """
        Build a regularized neural network that approximates SVM behavior
        Uses kernel regularization and dropout for smooth boundaries
        """
        self.preference_model = tf.keras.Sequential([
            tf.keras.layers.Dense(
                64, 
                activation='tanh',
                kernel_regularizer=tf.keras.regularizers.l2(0.01),
                input_shape=(self.latent_dim,)
            ),
            tf.keras.layers.Dropout(0.2),
            tf.keras.layers.Dense(
                32, 
                activation='tanh',
                kernel_regularizer=tf.keras.regularizers.l2(0.01)
            ),
            tf.keras.layers.Dense(1, activation='sigmoid')
        ])
        
        # Compile with appropriate optimizer and loss
        self.preference_model.compile(
            optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
            loss='binary_crossentropy',
            metrics=['accuracy']
        )
        
        return self.preference_model
    
    def train_preference_model(self, epochs=100, batch_size=32, validation_split=0.2):
        """
        Train preference model using Keras fit method
        """
        if not self.preference_data['latents']:
            raise ValueError("No preference data collected yet")
            
        # Prepare data
        X = np.array(self.preference_data['latents']).squeeze()
        y = np.array(self.preference_data['ratings'])
        
        # Build model if not exists
        if self.preference_model is None:
            self.build_preference_model()
            
        # Train with early stopping
        early_stop = tf.keras.callbacks.EarlyStopping(
            monitor='val_loss', patience=10, restore_best_weights=True
        )
        
        history = self.preference_model.fit(
            X, y,
            epochs=epochs,
            batch_size=batch_size,
            validation_split=validation_split,
            callbacks=[early_stop],
            verbose=1
        )
        
        return history
    
    def find_optimal_latent_gradient(self, n_steps=1000, lr=0.1):
        """
        Find optimal latent using TensorFlow's GradientTape
        This is a single optimization run - call multiple times for restarts

        Note: runs eagerly (not under @tf.function) because it creates a new
        tf.Variable and optimizer on every call, which tf.function does not
        allow after the first trace.
        """
        # Initialize latent variable
        z = tf.Variable(tf.random.normal([1, self.latent_dim]))
        optimizer = tf.keras.optimizers.Adam(learning_rate=lr)
        
        for step in range(n_steps):
            with tf.GradientTape() as tape:
                # Get preference score
                score = self.preference_model(z)
                # Maximize score (minimize negative)
                loss = -score
                
            # Compute and apply gradients
            gradients = tape.gradient(loss, [z])
            optimizer.apply_gradients(zip(gradients, [z]))
            
            # Constrain to reasonable range
            z.assign(tf.clip_by_value(z, -3.0, 3.0))
            
        return z, self.preference_model(z)
    
    def find_optimal_latent(self, n_starts=10, n_steps=1000, lr=0.1):
        """
        Multiple restart optimization to find global optimum
        """
        if self.preference_model is None:
            raise ValueError("Train preference model first")
            
        best_z = None
        best_score = -float('inf')
        
        for i in range(n_starts):
            z, score = self.find_optimal_latent_gradient(n_steps, lr)
            score_val = score.numpy()[0][0]
            
            if score_val > best_score:
                best_score = score_val
                best_z = z.numpy()
                
        # Generate optimal content
        best_z_tensor = tf.constant(best_z)
        optimal_content = self.generator(best_z_tensor, training=False)
        
        return optimal_content, best_z_tensor, best_score
    
    @tf.function
    def generate_batch_scores(self, z_batch):
        """Efficiently score a batch of latent vectors"""
        return self.preference_model(z_batch)
    
    def generate_distribution(self, n_samples=100, threshold=0.7, temperature=1.0):
        """
        Generate samples from high-preference regions
        Uses TensorFlow operations for efficiency
        """
        if self.preference_model is None:
            raise ValueError("Train preference model first")
            
        accepted_samples = []
        accepted_latents = []
        
        # Use larger batches for efficiency
        batch_size = 1000
        attempts = 0
        max_attempts = n_samples * 100
        
        while len(accepted_samples) < n_samples and attempts < max_attempts:
            # Generate batch
            z_batch = tf.random.normal([batch_size, self.latent_dim]) * temperature
            
            # Score batch
            scores = self.generate_batch_scores(z_batch)
            
            # Find high-scoring samples
            mask = scores[:, 0] > threshold
            good_indices = tf.where(mask)
            
            # Generate content for good samples
            for idx in good_indices:
                if len(accepted_samples) >= n_samples:
                    break
                    
                z = z_batch[idx[0]:idx[0]+1]
                content = self.generator(z, training=False)
                accepted_samples.append(content)
                accepted_latents.append(z)
                
            attempts += batch_size
            
        return accepted_samples, accepted_latents
    
    def active_learning_step(self, n_samples=10, pool_size=1000):
        """
        Select uncertain samples for active learning
        Uses entropy-based uncertainty sampling
        """
        if self.preference_model is None:
            raise ValueError("Train preference model first")
            
        # Generate candidate pool
        z_pool = tf.random.normal([pool_size, self.latent_dim])
        
        # Get predictions
        scores = self.preference_model(z_pool)
        
        # Calculate uncertainty (entropy)
        # H = -p*log(p) - (1-p)*log(1-p)
        p = scores[:, 0]
        entropy = -p * tf.math.log(p + 1e-7) - (1-p) * tf.math.log(1-p + 1e-7)
        
        # Select top uncertain samples
        top_k = tf.nn.top_k(entropy, k=n_samples)
        uncertain_indices = top_k.indices
        
        selected_samples = []
        for idx in uncertain_indices:
            z = z_pool[idx:idx+1]
            content = self.generator(z, training=False)
            selected_samples.append({
                'latent': z,
                'content': content,
                'uncertainty': entropy[idx].numpy()
            })
            
        return selected_samples

# Advanced TensorFlow features
class AdvancedPLGLTensorFlow(PLGLTensorFlow):
    """
    Enhanced version with TensorFlow-specific optimizations
    """
    
    def build_preference_model_with_attention(self):
        """
        Build a more sophisticated model with attention mechanism
        Useful for high-dimensional latent spaces
        """
        inputs = tf.keras.Input(shape=(self.latent_dim,))
        
        # Self-attention on latent dimensions
        x = tf.keras.layers.Dense(64)(inputs)
        attention = tf.keras.layers.MultiHeadAttention(
            num_heads=4, key_dim=16
        )(x, x)
        x = tf.keras.layers.Add()([x, attention])
        x = tf.keras.layers.LayerNormalization()(x)
        
        # Classification head
        x = tf.keras.layers.Dense(32, activation='tanh')(x)
        x = tf.keras.layers.Dropout(0.2)(x)
        outputs = tf.keras.layers.Dense(1, activation='sigmoid')(x)
        
        self.preference_model = tf.keras.Model(inputs=inputs, outputs=outputs)
        self.preference_model.compile(
            optimizer='adam',
            loss='binary_crossentropy',
            metrics=['accuracy']
        )
        
        return self.preference_model
    
    def parallel_optimization(self, n_starts=10, n_steps=1000):
        """
        Optimize multiple starting points in parallel
        Leverages TensorFlow's vectorized operations

        Note: runs eagerly (not under @tf.function) because it creates a new
        tf.Variable and optimizer on every call
        """
        # Initialize multiple latent variables
        z_batch = tf.Variable(tf.random.normal([n_starts, self.latent_dim]))
        optimizer = tf.keras.optimizers.Adam(learning_rate=0.1)
        
        for step in range(n_steps):
            with tf.GradientTape() as tape:
                # Score all candidates
                scores = self.preference_model(z_batch)
                # Maximize scores
                loss = -tf.reduce_sum(scores)
                
            gradients = tape.gradient(loss, [z_batch])
            optimizer.apply_gradients(zip(gradients, [z_batch]))
            
            # Constrain values
            z_batch.assign(tf.clip_by_value(z_batch, -3.0, 3.0))
            
        # Return best result
        final_scores = self.preference_model(z_batch)
        best_idx = tf.argmax(final_scores[:, 0])
        
        return z_batch[best_idx:best_idx+1], final_scores[best_idx]

# Example usage
def example_usage():
    """
    Demonstrates TensorFlow PLGL implementation
    """
    # Create a simple generator for demonstration
    generator = tf.keras.Sequential([
        tf.keras.layers.Dense(256, activation='relu', input_shape=(64,)),
        tf.keras.layers.Dense(512, activation='relu'),
        tf.keras.layers.Dense(256, activation='tanh')
    ])
    
    # Initialize PLGL
    latent_dim = 64
    plgl = PLGLTensorFlow(generator, latent_dim)
    
    # Collect preferences
    print("Collecting preferences...")
    samples = plgl.collect_preferences(n_samples=100)
    
    # Train model
    print("Training preference model...")
    history = plgl.train_preference_model(epochs=50)
    
    # Find optimal
    print("Finding optimal content...")
    optimal_content, optimal_z, score = plgl.find_optimal_latent()
    print(f"Optimal score: {score:.3f}")
    
    # Generate distribution
    print("Generating high-preference samples...")
    good_samples, _ = plgl.generate_distribution(n_samples=50)
    
    # Active learning
    print("Selecting uncertain samples...")
    uncertain = plgl.active_learning_step(n_samples=10)
    
    return plgl, optimal_content

if __name__ == "__main__":
    plgl, optimal = example_usage()
JAX Implementation

import jax
import jax.numpy as jnp
from jax import grad, jit, vmap, random
import optax
import flax.linen as nn
from typing import Callable, Tuple, List

class PLGLJAX:
    """
    JAX-idiomatic PLGL implementation
    
    JAX excels at functional programming and automatic differentiation.
    This implementation leverages JAX's strengths while maintaining the
    core PLGL philosophy through functional transformations.
    """
    
    def __init__(self, generator_fn: Callable, latent_dim: int, rng_key: jax.random.PRNGKey):
        self.generator_fn = generator_fn
        self.latent_dim = latent_dim
        self.rng_key = rng_key
        self.preference_params = None
        self.preference_data = {'latents': [], 'ratings': []}
        
    def collect_preferences(self, n_samples: int = 100, 
                          user_rating_fn: Callable = None) -> List[dict]:
        """
        Collect user preferences with JAX random number generation
        """
        samples = []
        key = self.rng_key
        
        for i in range(n_samples):
            # Split key for JAX random
            key, subkey = random.split(key)
            z = random.normal(subkey, (1, self.latent_dim))
            
            # Generate content
            content = self.generator_fn(z)
            
            # Get user rating
            if user_rating_fn:
                rating = user_rating_fn(jax.device_get(content))
            else:
                # Simulated rating for demonstration
                rating = 1 if jnp.linalg.norm(z) < 1.5 else 0
            
            self.preference_data['latents'].append(z)
            self.preference_data['ratings'].append(rating)
            
            samples.append({
                'latent': z,
                'content': content,
                'rating': rating
            })
            
        return samples
    
    def create_preference_model(self) -> nn.Module:
        """
        Define preference model using Flax (JAX's neural network library)
        Uses bounded activations and regularization for SVM-like behavior
        """
        class PreferenceModel(nn.Module):
            features: Tuple[int, ...] = (64, 32)
            
            @nn.compact
            def __call__(self, x, deterministic: bool = True):
                # First layer with bounded activation
                x = nn.Dense(self.features[0])(x)
                x = nn.tanh(x)  # Bounded activation like SVM
                # Dropout is only active during training (deterministic=False)
                x = nn.Dropout(0.2, deterministic=deterministic)(x)
                
                # Second layer
                x = nn.Dense(self.features[1])(x)
                x = nn.tanh(x)
                
                # Output layer
                x = nn.Dense(1)(x)
                return nn.sigmoid(x)
                
        return PreferenceModel()
    
    def train_preference_model(self, epochs: int = 100, batch_size: int = 32, 
                             learning_rate: float = 0.01) -> dict:
        """
        Train preference model using JAX's functional approach
        """
        if not self.preference_data['latents']:
            raise ValueError("No preference data collected yet")
            
        # Prepare data
        X = jnp.concatenate(self.preference_data['latents'], axis=0)
        y = jnp.array(self.preference_data['ratings']).reshape(-1, 1)
        
        # Initialize model
        model = self.create_preference_model()
        key, subkey = random.split(self.rng_key)
        params = model.init(subkey, X[:1])
        
        # Define loss function
        def loss_fn(params, x_batch, y_batch, key):
            # Apply model with dropout active during training
            logits = model.apply(params, x_batch, deterministic=False,
                                 rngs={'dropout': key})
            # Binary cross-entropy loss
            loss = -jnp.mean(
                y_batch * jnp.log(logits + 1e-7) + 
                (1 - y_batch) * jnp.log(1 - logits + 1e-7)
            )
            # Add L2 regularization
            l2_loss = sum(jnp.sum(p**2) for p in jax.tree_util.tree_leaves(params))
            return loss + 0.01 * l2_loss
        
        # Create optimizer
        optimizer = optax.adam(learning_rate)
        opt_state = optimizer.init(params)
        
        # Training step
        @jit
        def train_step(params, opt_state, x_batch, y_batch, key):
            loss, grads = jax.value_and_grad(loss_fn)(params, x_batch, y_batch, key)
            updates, opt_state = optimizer.update(grads, opt_state)
            params = optax.apply_updates(params, updates)
            return params, opt_state, loss
        
        # Training loop
        n_batches = len(X) // batch_size
        for epoch in range(epochs):
            # Shuffle data
            key, subkey = random.split(key)
            perm = random.permutation(subkey, len(X))
            X_shuffled = X[perm]
            y_shuffled = y[perm]
            
            total_loss = 0
            for i in range(n_batches):
                start_idx = i * batch_size
                end_idx = start_idx + batch_size
                
                x_batch = X_shuffled[start_idx:end_idx]
                y_batch = y_shuffled[start_idx:end_idx]
                
                key, subkey = random.split(key)
                params, opt_state, loss = train_step(
                    params, opt_state, x_batch, y_batch, subkey
                )
                total_loss += loss
                
            if (epoch + 1) % 20 == 0:
                avg_loss = total_loss / n_batches
                print(f"Epoch {epoch+1}/{epochs}, Loss: {avg_loss:.4f}")
                
        self.preference_params = params
        self.preference_model = model
        return params
    
    def preference_score(self, params, model, z):
        """Preference scoring in deterministic (inference) mode.
        Not decorated with @jit because `self` cannot be traced; callers
        wrap this in jitted objectives where needed."""
        return model.apply(params, z)
    
    def find_optimal_latent(self, n_starts: int = 10, n_steps: int = 1000, 
                          learning_rate: float = 0.1) -> Tuple[jnp.ndarray, jnp.ndarray, float]:
        """
        Find optimal latent using JAX's gradient-based optimization
        """
        if self.preference_params is None:
            raise ValueError("Train preference model first")
            
        # Define objective function
        @jit
        def objective(z):
            score = self.preference_score(self.preference_params, self.preference_model, z)
            return -score[0, 0]  # Minimize negative score
        
        # Gradient of objective
        grad_fn = jit(grad(objective))
        
        best_z = None
        best_score = -float('inf')
        key = self.rng_key
        
        for _ in range(n_starts):
            # Random initialization
            key, subkey = random.split(key)
            z = random.normal(subkey, (1, self.latent_dim))
            
            # Optimization using optax
            optimizer = optax.adam(learning_rate)
            opt_state = optimizer.init(z)
            
            @jit
            def step(z, opt_state):
                grads = grad_fn(z)
                updates, opt_state = optimizer.update(grads, opt_state)
                z = optax.apply_updates(z, updates)
                # Constrain to reasonable range
                z = jnp.clip(z, -3.0, 3.0)
                return z, opt_state
            
            # Run optimization
            for _ in range(n_steps):
                z, opt_state = step(z, opt_state)
                
            # Evaluate final score
            score = -objective(z)
            if score > best_score:
                best_score = score
                best_z = z
                
        # Generate optimal content
        optimal_content = self.generator_fn(best_z)
        
        return optimal_content, best_z, best_score
    
    def generate_distribution(self, n_samples: int = 100, threshold: float = 0.7, 
                            temperature: float = 1.0) -> Tuple[List[jnp.ndarray], List[jnp.ndarray]]:
        """
        Generate samples from high-preference regions using JAX's vectorization
        """
        if self.preference_params is None:
            raise ValueError("Train preference model first")
            
        accepted_samples = []
        accepted_latents = []
        key = self.rng_key
        
        # Vectorized scoring function
        score_batch = vmap(lambda z: self.preference_score(
            self.preference_params, self.preference_model, z.reshape(1, -1)
        ))
        
        attempts, max_attempts = 0, n_samples * 100
        while len(accepted_samples) < n_samples and attempts < max_attempts:
            # Generate batch (capped to avoid an infinite loop if the threshold is too strict)
            batch_size = min(1000, (n_samples - len(accepted_samples)) * 10)
            key, subkey = random.split(key)
            z_batch = random.normal(subkey, (batch_size, self.latent_dim)) * temperature
            attempts += batch_size
            
            # Score batch efficiently
            scores = score_batch(z_batch).squeeze()
            
            # Find high-scoring samples
            good_indices = jnp.where(scores > threshold)[0]
            
            # Generate content for good samples
            for idx in good_indices:
                if len(accepted_samples) >= n_samples:
                    break
                    
                z = z_batch[idx:idx+1]
                content = self.generator_fn(z)
                accepted_samples.append(content)
                accepted_latents.append(z)
                
        return accepted_samples, accepted_latents
    
    def active_learning_step(self, n_samples: int = 10, pool_size: int = 1000) -> List[dict]:
        """
        Select uncertain samples using JAX's functional transformations
        """
        if self.preference_params is None:
            raise ValueError("Train preference model first")
            
        # Generate candidate pool
        key, subkey = random.split(self.rng_key)
        z_pool = random.normal(subkey, (pool_size, self.latent_dim))
        
        # Vectorized scoring
        score_batch = vmap(lambda z: self.preference_score(
            self.preference_params, self.preference_model, z.reshape(1, -1)
        ))
        scores = score_batch(z_pool).squeeze()
        
        # Calculate uncertainty (entropy)
        entropy = -scores * jnp.log(scores + 1e-7) - (1-scores) * jnp.log(1-scores + 1e-7)
        
        # Select most uncertain
        top_indices = jnp.argsort(-entropy)[:n_samples]
        
        selected_samples = []
        for idx in top_indices:
            z = z_pool[idx:idx+1]
            content = self.generator_fn(z)
            selected_samples.append({
                'latent': z,
                'content': content,
                'uncertainty': entropy[idx]
            })
            
        return selected_samples

# Advanced JAX features
class AdvancedPLGLJAX(PLGLJAX):
    """
    Enhanced version leveraging advanced JAX features
    """
    
    def parallel_optimization(self, n_starts: int = 10, n_steps: int = 1000) -> Tuple[jnp.ndarray, jnp.ndarray]:
        """
        Optimize multiple starting points in parallel using vmap

        Not decorated with @jit at the method level (`self` cannot be traced);
        the vmapped inner optimization does the heavy lifting.
        """
        if self.preference_params is None:
            raise ValueError("Train preference model first")
            
        # Initialize multiple starting points
        key, subkey = random.split(self.rng_key)
        z_init = random.normal(subkey, (n_starts, self.latent_dim))
        
        # Define single optimization trajectory
        def optimize_single(z0):
            # Objective for this trajectory
            def objective(z):
                score = self.preference_score(self.preference_params, self.preference_model, z.reshape(1, -1))
                return -score[0, 0]
            
            # Gradient
            grad_fn = grad(objective)
            
            # Optimization loop
            z = z0
            optimizer = optax.adam(0.1)
            opt_state = optimizer.init(z)
            
            for _ in range(n_steps):
                grads = grad_fn(z)
                updates, opt_state = optimizer.update(grads, opt_state)
                z = optax.apply_updates(z, updates)
                z = jnp.clip(z, -3.0, 3.0)
                
            return z, -objective(z)
        
        # Vectorize over starting points
        optimize_batch = vmap(optimize_single)
        final_z, final_scores = optimize_batch(z_init)
        
        # Select best result
        best_idx = jnp.argmax(final_scores)
        return final_z[best_idx], final_scores[best_idx]

# Example usage
def example_usage():
    """
    Demonstrates JAX PLGL implementation
    """
    # Simple generator for demonstration
    def generator_fn(z):
        # Simple linear transformation
        W = jax.random.normal(jax.random.PRNGKey(42), (z.shape[-1], 256))
        return jnp.tanh(z @ W)
    
    # Initialize
    latent_dim = 64
    rng_key = jax.random.PRNGKey(0)
    plgl = PLGLJAX(generator_fn, latent_dim, rng_key)
    
    # Collect preferences
    print("Collecting preferences...")
    samples = plgl.collect_preferences(n_samples=100)
    
    # Train model
    print("Training preference model...")
    params = plgl.train_preference_model(epochs=50)
    
    # Find optimal
    print("Finding optimal content...")
    optimal_content, optimal_z, score = plgl.find_optimal_latent()
    print(f"Optimal score: {score:.3f}")
    
    # Generate distribution
    print("Generating high-preference samples...")
    good_samples, _ = plgl.generate_distribution(n_samples=50)
    
    # Active learning
    print("Selecting uncertain samples...")
    uncertain = plgl.active_learning_step(n_samples=10)
    
    return plgl, optimal_content

if __name__ == "__main__":
    plgl, optimal = example_usage()
"""
Original PLGL Implementation (2018-2019)
SkinDeep.ai Inc - Historical Reference

This is the original numpy-based implementation that pioneered
preference learning in generative latent spaces.
"""

import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from scipy.optimize import minimize
import pickle

class OriginalPLGL:
    """
    Original 2018-2019 implementation using NumPy and scikit-learn
    Designed for StyleGAN latent space navigation
    """
    
    def __init__(self, latent_dim=512, generator_func=None):
        self.latent_dim = latent_dim
        self.generator_func = generator_func
        self.classifier = None
        self.user_ratings = []
        self.latent_vectors = []
        
    def collect_preferences(self, n_samples=100, save_path=None):
        """
        Original preference collection approach
        Used random sampling with human-in-the-loop rating
        """
        print(f"Collecting {n_samples} preference ratings...")
        
        for i in range(n_samples):
            # Sample from standard normal (StyleGAN convention)
            z = np.random.randn(1, self.latent_dim)
            
            # Generate content (originally face images)
            if self.generator_func:
                content = self.generator_func(z)
                # Display to user and collect rating
                # In the original app, this was done via mobile UI
                rating = self._get_user_rating(content)
            else:
                # Simulated for demonstration
                rating = np.random.choice([0, 1])
            
            self.latent_vectors.append(z[0])
            self.user_ratings.append(rating)
            
            if save_path and i % 10 == 0:
                self._save_checkpoint(save_path)
                
        return np.array(self.latent_vectors), np.array(self.user_ratings)
    
    def train_classifier(self, kernel='rbf', C=1.0, gamma='scale'):
        """
        Original approach: SVM classifier for preference prediction
        Found to work well in high-dimensional latent spaces
        """
        if len(self.user_ratings) < 10:
            raise ValueError("Need at least 10 ratings to train")
            
        X = np.array(self.latent_vectors)
        y = np.array(self.user_ratings)
        
        # Split for validation
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=0.2, random_state=42
        )
        
        # Train SVM classifier
        self.classifier = SVC(kernel=kernel, C=C, gamma=gamma, probability=True)
        self.classifier.fit(X_train, y_train)
        
        # Report accuracy
        train_acc = self.classifier.score(X_train, y_train)
        test_acc = self.classifier.score(X_test, y_test)
        
        print(f"Training accuracy: {train_acc:.2f}")
        print(f"Test accuracy: {test_acc:.2f}")
        
        return self.classifier
    
    def reverse_classification(self, n_starts=10, method='L-BFGS-B'):
        """
        Reverse Classification™: Find optimal latent vector that maximizes preference
        
        This is the core innovation of PLGL - instead of classifying existing content,
        we reverse the process to find the latent code that would produce the most
        preferred content according to the trained classifier.
        
        Original optimization: Scipy minimize with multiple restarts
        Objective: Find z that maximizes classifier confidence
        """
        if self.classifier is None:
            raise ValueError("Train classifier first")
            
        best_z = None
        best_score = -np.inf
        
        def objective(z):
            # Reshape for classifier
            z_reshaped = z.reshape(1, -1)
            # Get probability of positive class
            prob = self.classifier.predict_proba(z_reshaped)[0, 1]
            # Minimize negative probability (maximize positive)
            return -prob
        
        # Multiple random restarts for global optimization
        for _ in range(n_starts):
            # Random initialization
            z0 = np.random.randn(self.latent_dim)
            
            # Optimize
            result = minimize(
                objective,
                z0,
                method=method,
                options={'maxiter': 100}
            )
            
            if -result.fun > best_score:
                best_score = -result.fun
                best_z = result.x
                
        print(f"Found optimal with score: {best_score:.3f}")
        return best_z
    
    def find_optimal_latent(self, *args, **kwargs):
        """Alias for reverse_classification for backward compatibility"""
        return self.reverse_classification(*args, **kwargs)
    
    def generate_distribution(self, n_samples=100, threshold=0.7):
        """
        Original distribution generation approach
        Sample and filter based on classifier confidence
        """
        if self.classifier is None:
            raise ValueError("Train classifier first")
            
        accepted_samples = []
        total_tried = 0
        
        while len(accepted_samples) < n_samples:
            # Batch sampling for efficiency
            batch_size = min(100, (n_samples - len(accepted_samples)) * 2)
            z_batch = np.random.randn(batch_size, self.latent_dim)
            
            # Get probabilities
            probs = self.classifier.predict_proba(z_batch)[:, 1]
            
            # Accept high-scoring samples
            accepted_idx = probs > threshold
            accepted_samples.extend(z_batch[accepted_idx])
            
            total_tried += batch_size
            
            # Prevent infinite loop
            if total_tried > n_samples * 100:
                print(f"Warning: Could only find {len(accepted_samples)} samples")
                break
                
        return np.array(accepted_samples[:n_samples])
    
    def iterative_refinement(self, n_iterations=5, samples_per_iter=20):
        """
        Original iterative improvement strategy
        Alternates between generation and rating collection
        """
        print("Starting iterative refinement process...")
        
        for iteration in range(n_iterations):
            print(f"\nIteration {iteration + 1}/{n_iterations}")
            
            # Generate samples from current model
            if iteration == 0:
                # First iteration: random sampling
                new_samples = np.random.randn(samples_per_iter, self.latent_dim)
            else:
                # Later iterations: guided by current classifier
                distribution = self.generate_distribution(samples_per_iter * 2)
                # Add some random samples for exploration
                guided = distribution[:int(samples_per_iter * 0.8)]
                random_samples = np.random.randn(int(samples_per_iter * 0.2), self.latent_dim)
                new_samples = np.vstack([guided, random_samples])
            
            # Collect ratings for new samples
            for z in new_samples:
                if self.generator_func:
                    content = self.generator_func(z.reshape(1, -1))
                    rating = self._get_user_rating(content)
                else:
                    # Simulated
                    rating = np.random.choice([0, 1])
                    
                self.latent_vectors.append(z)
                self.user_ratings.append(rating)
            
            # Retrain classifier with all data
            self.train_classifier()
            
            # Find current optimal using reverse classification
            optimal_z = self.reverse_classification()
            
            print(f"Total ratings collected: {len(self.user_ratings)}")
            
    def _get_user_rating(self, content):
        """
        In the original app, this displayed content on mobile device
        and collected swipe left (0) or swipe right (1)
        """
        # Placeholder for actual user interaction
        return np.random.choice([0, 1])
    
    def _save_checkpoint(self, path):
        """Save current state for recovery"""
        checkpoint = {
            'latent_vectors': self.latent_vectors,
            'user_ratings': self.user_ratings,
            'classifier': self.classifier
        }
        with open(path, 'wb') as f:
            pickle.dump(checkpoint, f)
            
    def load_checkpoint(self, path):
        """Load saved state"""
        with open(path, 'rb') as f:
            checkpoint = pickle.load(f)
        self.latent_vectors = checkpoint['latent_vectors']
        self.user_ratings = checkpoint['user_ratings']
        self.classifier = checkpoint['classifier']


# Example usage demonstrating the original workflow
if __name__ == "__main__":
    print("=== Original PLGL Implementation Demo ===")
    print("Historical reference from SkinDeep.ai (2018-2019)\n")
    
    # Initialize with StyleGAN dimensions
    plgl = OriginalPLGL(latent_dim=512)
    
    # Simulate the original data collection process
    print("Phase 1: Initial preference collection")
    plgl.collect_preferences(n_samples=50)
    
    print("\nPhase 2: Train preference classifier")
    plgl.train_classifier()
    
    print("\nPhase 3: Reverse Classification - Find optimal latent vector")
    optimal_z = plgl.reverse_classification()
    
    print("\nPhase 4: Generate preference distribution")
    distribution = plgl.generate_distribution(n_samples=10)
    print(f"Generated {len(distribution)} samples matching preferences")
    
    print("\nPhase 5: Iterative refinement")
    plgl.iterative_refinement(n_iterations=3)
    
    print("\nโœจ This approach pioneered preference learning in latent spaces!")
    print("Now evolved into the modern PLGL framework with deep learning.")

# Original implementation references:
# Server: https://github.com/skindeepai/skindeep-server/blob/master/server.py
# Mobile App: https://github.com/skindeepai/skindeep-mobile
# API: https://github.com/skindeepai/skindeep-server

Ready to Personalize AI?

Join the revolution in preference-driven content generation with PLGL by SkinDeep.ai Inc