Generative AI

This article focuses primarily on computer vision and diffusion models.

Real World Applications

Video/Image Restoration
- Take an old, low-quality, or blurred video or photo and improve it using deep learning.
Image Editing and Synthesis Using Text Commands
- “Make my smile wider”: text-guided edits
- “Segment the man wearing a blue shirt and brown pants from an image”
Text-to-Speech Synthesis
- A good summary of TTS algorithms is available from aiSummer School
Speech-to-Text
- OpenAI’s Whisper
Audio Generation
- Riffusion
Code synthesis
Generating Fakes (Photos, Videos, Personas)
- This is the bread and butter of generative algorithms

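Text-Guided Image Generation: Commonly implemented with guidance techniques such as classifier guidance, CLIP guidance, or classifier-free guidance, which steer a diffusion model's samples toward the text prompt.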
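For reference, the two guidance rules most often cited for diffusion models can be written as follows. The notation is an assumption made here, not something defined in this article: \epsilon_\theta is the model's noise prediction, y is the condition (a class label or text prompt), p_\phi is a separately trained classifier, \varnothing is the empty condition, and s and w are guidance scales.

```latex
% Classifier guidance: shift the predicted noise using the classifier's gradient
\hat{\epsilon}_\theta(x_t, y) = \epsilon_\theta(x_t) - s\,\sqrt{1 - \bar{\alpha}_t}\;\nabla_{x_t} \log p_\phi(y \mid x_t)

% Classifier-free guidance: blend conditional and unconditional predictions
\tilde{\epsilon}_\theta(x_t, y) = \epsilon_\theta(x_t, \varnothing) + w\,\bigl(\epsilon_\theta(x_t, y) - \epsilon_\theta(x_t, \varnothing)\bigr)
```

In the second form, w = 1 recovers the plain conditional prediction, and larger values push samples harder toward the condition, usually at some cost in diversity.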
In-Painting: Filling in missing or corrupted parts of an image or video with plausible content. Generative models such as Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs) learn the underlying distribution of the data and can then generate new content that is consistent with the surrounding area.
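A minimal sketch of the compositing step in in-painting, assuming a binary mask marks the missing pixels: known pixels are kept as they are and only masked pixels are taken from the model's proposal. The `neighbor_mean_generator` function is a hypothetical stand-in for a trained VAE or GAN, used only so that the example runs on its own.

```python
def inpaint(image, mask, generate):
    """Fill masked pixels with generated content; keep known pixels unchanged.

    image: 2-D list of floats. mask: 2-D list, 1 = missing, 0 = known.
    generate: callable returning a full proposal image of the same shape.
    """
    proposal = generate(image, mask)
    return [
        [p if m else x for x, m, p in zip(img_row, mask_row, prop_row)]
        for img_row, mask_row, prop_row in zip(image, mask, proposal)
    ]


def neighbor_mean_generator(image, mask):
    """Hypothetical stand-in for a trained model: mean of the known 4-neighbours."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for i in range(h):
        for j in range(w):
            if mask[i][j]:
                known = [
                    image[a][b]
                    for a, b in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                    if 0 <= a < h and 0 <= b < w and not mask[a][b]
                ]
                out[i][j] = sum(known) / len(known) if known else 0.0
    return out


image = [
    [1.0, 1.0, 1.0],
    [1.0, 0.0, 1.0],  # the centre pixel is treated as corrupted
    [1.0, 1.0, 1.0],
]
mask = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
restored = inpaint(image, mask, neighbor_mean_generator)
```

A real system would replace the stand-in with a learned generator, but the masking logic that keeps the result consistent with the surrounding area stays the same.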

Style Transfer: Applying the style of one image to another while preserving the content of the original. This is typically done by training a generative model to separate the style and content representations of an image, and then recombining the content of one image with the style of another.
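One common way to make the style/content split concrete comes from the neural style transfer literature: content is compared directly on feature maps, while style is compared through the Gram matrix of those features, which discards spatial layout. The sketch below assumes features are already extracted (for example by a pretrained CNN) and are given as a list of channels. The function names and `style_weight` are illustrative, and this is not necessarily the approach the paragraph above has in mind.

```python
def gram_matrix(features):
    """Channel-by-channel correlations; features is a list of C flat channels."""
    n = len(features[0])
    return [
        [sum(a * b for a, b in zip(fi, fj)) / n for fj in features]
        for fi in features
    ]


def mse(a, b):
    """Mean squared error between two equally shaped 2-D lists."""
    flat_a = [v for row in a for v in row]
    flat_b = [v for row in b for v in row]
    return sum((x - y) ** 2 for x, y in zip(flat_a, flat_b)) / len(flat_a)


def style_transfer_loss(generated, content, style, style_weight=1.0):
    """Content term on raw features plus style term on Gram matrices."""
    content_loss = mse(generated, content)
    style_loss = mse(gram_matrix(generated), gram_matrix(style))
    return content_loss + style_weight * style_loss


# Two channels, three spatial positions each (toy feature maps).
content_feats = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
# Same activations in a different spatial order: same "style", different layout.
shuffled_feats = [[3.0, 1.0, 2.0], [6.0, 4.0, 5.0]]
same_style = gram_matrix(content_feats) == gram_matrix(shuffled_feats)
```

Because the Gram matrix ignores where an activation occurs, two feature maps with the same activations in a different order count as having the same style, which is why it is used for the style term.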
Upscaling Images:
- Super-resolution: This refers to the process of increasing the resolution of an image. Generative models, such as GANs, can be trained to learn the mapping from low-resolution images to high-resolution images.
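To learn a low-resolution to high-resolution mapping, a model needs paired examples. A common way to get them is to downsample high-resolution images. The sketch below assumes simple average pooling for the downsampling step (real pipelines often use bicubic interpolation instead), and `make_training_pair` is an illustrative helper name.

```python
def average_pool(image, factor):
    """Downsample a 2-D list by averaging non-overlapping factor x factor blocks."""
    h, w = len(image), len(image[0])
    return [
        [
            sum(image[i + di][j + dj] for di in range(factor) for dj in range(factor))
            / factor ** 2
            for j in range(0, w - w % factor, factor)
        ]
        for i in range(0, h - h % factor, factor)
    ]


def make_training_pair(high_res, factor=2):
    """Return (low_res_input, high_res_target) for supervised super-resolution."""
    return average_pool(high_res, factor), high_res


high_res = [
    [0.0, 2.0, 4.0, 6.0],
    [2.0, 4.0, 6.0, 8.0],
    [1.0, 1.0, 3.0, 3.0],
    [1.0, 1.0, 3.0, 3.0],
]
low_res, target = make_training_pair(high_res)
```

The generator is then trained to map `low_res` back to `target`. In a GAN setup, a discriminator additionally judges whether the upscaled output looks like a real high-resolution image.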
Few-Shot Learning:
- Neural Network Pre-Training: This refers to the process of training a generative model on a large dataset, and then using the learned representations as a starting point for fine-tuning on a smaller dataset. This can be useful when the amount of labeled data is limited, as the pre-trained model can provide a good initialization that allows the model to quickly converge to a good solution when fine-tuning on the smaller dataset.
Reinforcement Learning Exploration: Generative models can be used in Reinforcement Learning (RL) to improve exploration. For example, a GAN can be trained to generate new samples that resemble existing training samples but with slight variations. These generated samples can then broaden the set of states the RL agent encounters, allowing it to explore and learn from a wider range of scenarios.

CLIP for multi-modal learning
Prompt engineering and chain-of-thought prompting
Reinforcement learning from human feedback (RLHF): reinforcing behavior based on human feedback
Stable Diffusion: Combines the strengths of VAEs and diffusion models, running the diffusion process in the VAE's compressed latent space to make things faster
Super-resolution: A guided diffusion model trained on high-resolution images and conditioned on the low-resolution input image
Cascaded Diffusion Models: A low-resolution text-conditioned or class-conditioned diffusion model chained with multiple super-resolution models
Textual Inversion
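The cascaded design in the list above can be sketched as a simple pipeline: a base model produces a small image, and each later stage upsamples the previous stage's output. Everything here is a stand-in: `base_model` and `super_resolution_stage` are hypothetical placeholders for trained diffusion models, and nearest-neighbour upsampling replaces the learned super-resolution step so that the example runs on its own.

```python
def upsample_2x(image):
    """Nearest-neighbour 2x upsampling of a 2-D list."""
    out = []
    for row in image:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(wide[:])
    return out


def base_model(prompt):
    """Hypothetical stand-in for a low-resolution conditional diffusion model."""
    # A real model would sample an image conditioned on the prompt or class label.
    return [[0.1, 0.2], [0.3, 0.4]]


def super_resolution_stage(low_res, prompt):
    """Hypothetical stand-in for a diffusion model conditioned on the low-res image."""
    return upsample_2x(low_res)


def cascade(prompt, base, stages):
    """Run the base model, then feed each stage's output into the next stage."""
    image = base(prompt)
    for stage in stages:
        image = stage(image, prompt)
    return image


image = cascade("a red bird", base_model, [super_resolution_stage, super_resolution_stage])
```

With two 2x stages, the 2x2 base output becomes 8x8. The point of the structure is that each model only has to solve one part of the problem: composition at low resolution, then detail at each higher resolution.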

Quidgest article on Generative AI:
- Industry impact and predictions about generative AI
- Applications in the industry
Canary Mail

Want to connect? Reach out @varuntul22.