AI Research
Curating Pioneering Research
Cure the headache of Transformers via Collinear Constrained Attention
Prof. Otto Nomos ∙ May 27, 2024 ∙ 2 min read
Abstract Commentary & Rating

Natural Language Supervision for General-Purpose Audio Representations
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 2 min read
Abstract Commentary & Rating

StyleAdapter: A Single-Pass LoRA-Free Model for Stylized Image Generation
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 2 min read
Abstract Commentary & Rating

/Agents

MindAgent: Emergent Gaming Interaction
Prof. Otto Nomos ∙ May 27, 2024 ∙ 2 min read
Abstract Commentary & Rating

A Data Source for Reasoning Embodied Agents
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 2 min read
Abstract Commentary & Rating

LASER: LLM Agent with State-Space Exploration for Web Navigation
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 2 min read
Abstract Commentary & Rating

Agents: An Open-source Framework for Autonomous Language Agents
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 2 min read
Abstract Commentary & Rating

/Alignment

From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

Simple synthetic data reduces sycophancy in large language models
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

/Audio

Augmenting text for spoken language understanding with Large Language Models
Prof. Otto Nomos ∙ May 25, 2024 ∙ 2 min read
Abstract Commentary & Rating

Natural Language Supervision for General-Purpose Audio Representations
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 2 min read
Abstract Commentary & Rating

Improving Joint Speech-Text Representations Without Alignment
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

/Chat

S3-DST: Structured Open-Domain Dialogue Segmentation and State Tracking in the Era of LLMs
Prof. Otto Nomos ∙ May 25, 2024 ∙ 2 min read
Abstract Commentary & Rating

Investigating Answerability of LLMs for Long-Form Question Answering
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 2 min read
Abstract Commentary & Rating

SoTaNa: The Open-Source Software Development Assistant
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

PIPPA: A Partially Synthetic Conversational Dataset
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

/Climate

/Code

Large Language Models for Compiler Optimization
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 2 min read
Abstract Commentary & Rating

LLM As DBA
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

BioCoder: A Benchmark for Bioinformatics Code Generation with Contextual Pragmatic Knowledge
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

Can Programming Languages Boost Each Other via Instruction Tuning?
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

/Commentary

LoraHub: Efficient Cross-Task Generalization Via Dynamic LoRA Composition [Commentary]
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 8 min read
Explore the dynamic composition of LoRA modules with LoraHub for adaptable LLM performance. Dive into problem statements, methodology, evaluation, and more.

ToolLLM: Facilitating Large Language Models To Master 16000+ Real-World APIs [Commentary]
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 4 min read
Explore the power of Large Language Models (LLMs) in API interaction with our summary of 'Tool LLM'

/Compression

Language Modeling Is Compression
Prof. Otto Nomos ∙ May 25, 2024 ∙ 2 min read
Abstract Commentary & Rating

A Survey on Model Compression for Large Language Models
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

/Data

SlimPajama-DC: Understanding Data Combinations for LLM Training
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 2 min read
Abstract Commentary & Rating

CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 2 min read
Abstract Commentary & Rating

A Data Source for Reasoning Embodied Agents
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 2 min read
Abstract Commentary & Rating

When Less is More: Investigating Data Pruning for Pretraining LLMs at Scale
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 2 min read
Abstract Commentary & Rating

/Diffusion

PDE-Refiner: Achieving Accurate Long Rollouts with Neural PDE Solvers
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

FLIRT: Feedback Loop In-context Red Teaming
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
Abstract Commentary & Rating

/Edge

/Entity

LMDX: Language Model-based Document Information Extraction and Localization
Prof. Otto Nomos ∙ May 24, 2024 ∙ 2 min read
Abstract Commentary & Rating

Leveraging Contextual Information for Effective Entity Salience Detection
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 2 min read
Abstract Commentary & Rating

/Evaluation

Are Large Language Model-based Evaluators the Solution to Scaling Up Multilingual Evaluation?
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 2 min read
Abstract Commentary & Rating

The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

WeatherBench 2: A benchmark for the next generation of data-driven global weather models
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

/Fine-tuning

Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference Using Sorted Fine-Tuning (SoFT)
Prof. Otto Nomos ∙ May 27, 2024 ∙ 2 min read
Abstract Commentary & Rating

Leveraging Contextual Information for Effective Entity Salience Detection
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 2 min read
Abstract Commentary & Rating

FIAT: Fusing learning paradigms with Instruction-Accelerated Tuning
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 2 min read
Abstract Commentary & Rating

Platypus: Quick, Cheap, and Powerful Refinement of LLMs
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

/Gaming

/Hallucination

Chain-of-Verification Reduces Hallucination in Large Language Models
Prof. Otto Nomos ∙ May 24, 2024 ∙ 1 min read
Abstract Commentary & Rating

LMDX: Language Model-based Document Information Extraction and Localization
Prof. Otto Nomos ∙ May 24, 2024 ∙ 2 min read
Abstract Commentary & Rating

/Healthcare

Clinical Text Summarization: Adapting Large Language Models Can Outperform Human Experts
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 2 min read
Abstract Commentary & Rating

Predicting transcriptional outcomes of novel multigene perturbations with GEARS
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

BioCoder: A Benchmark for Bioinformatics Code Generation with Contextual Pragmatic Knowledge
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

/Image

StyleAdapter: A Single-Pass LoRA-Free Model for Stylized Image Generation
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 2 min read
Abstract Commentary & Rating

FLIRT: Feedback Loop In-context Red Teaming
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
Abstract Commentary & Rating

/In-Context Learning

Ambiguity-Aware In-Context Learning with Large Language Models
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 2 min read
Abstract Commentary & Rating

FIAT: Fusing learning paradigms with Instruction-Accelerated Tuning
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 2 min read
Abstract Commentary & Rating

RAVEN: In-Context Learning with Retrieval Augmented Encoder-Decoder Language Models
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

CausalLM is not optimal for in-context learning
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

/Inference

/Instruction Tuning

An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models
Prof. Otto Nomos ∙ May 27, 2024 ∙ 2 min read
Abstract Commentary & Rating

XGen-7B Technical Report
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 2 min read
Abstract Commentary & Rating

Self-Alignment with Instruction Backtranslation
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

Can Programming Languages Boost Each Other via Instruction Tuning?
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

/Interpretability

/Legal

/LLM

OpenBA: An Open-sourced 15B Bilingual Asymmetric seq2seq Model Pre-trained from Scratch
Prof. Otto Nomos ∙ May 27, 2024 ∙ 2 min read
Abstract Commentary & Rating

PDFTriage: Question Answering over Long, Structured Documents
Prof. Otto Nomos ∙ May 27, 2024 ∙ 2 min read
Abstract Commentary & Rating

Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference Using Sorted Fine-Tuning (SoFT)
Prof. Otto Nomos ∙ May 27, 2024 ∙ 2 min read
Abstract Commentary & Rating

An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models
Prof. Otto Nomos ∙ May 27, 2024 ∙ 2 min read
Abstract Commentary & Rating

/LORA

StyleAdapter: A Single-Pass LoRA-Free Model for Stylized Image Generation
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 2 min read
Abstract Commentary & Rating

LoraHub: Efficient Cross-Task Generalization Via Dynamic LoRA Composition [Commentary]
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 8 min read
Explore the dynamic composition of LoRA modules with LoraHub for adaptable LLM performance. Dive into problem statements, methodology, evaluation, and more.

Platypus: Quick, Cheap, and Powerful Refinement of LLMs
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

/Math

Large Language Model for Science: A Study on P vs. NP
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 2 min read
Abstract Commentary & Rating

GPT Can Solve Mathematical Problems Without a Calculator
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 1 min read
Abstract Commentary & Rating

/Multilingual

OpenBA: An Open-sourced 15B Bilingual Asymmetric seq2seq Model Pre-trained from Scratch
Prof. Otto Nomos ∙ May 27, 2024 ∙ 2 min read
Abstract Commentary & Rating

Baichuan 2: Open Large-scale Language Models
Prof. Otto Nomos ∙ May 24, 2024 ∙ 2 min read
Abstract Commentary & Rating

CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 2 min read
Abstract Commentary & Rating

Are Large Language Model-based Evaluators the Solution to Scaling Up Multilingual Evaluation?
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 2 min read
Abstract Commentary & Rating

/Multimodal

An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models
Prof. Otto Nomos ∙ May 27, 2024 ∙ 2 min read
Abstract Commentary & Rating

Natural Language Supervision for General-Purpose Audio Representations
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 2 min read
Abstract Commentary & Rating

Improving Joint Speech-Text Representations Without Alignment
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

/Open Source

/Personalization

/Privacy

/Prompting

Adapting Large Language Models via Reading Comprehension
Prof. Otto Nomos ∙ May 27, 2024 ∙ 2 min read
Abstract Commentary & Rating

Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 2 min read
Abstract Commentary & Rating

DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 2 min read
Abstract Commentary & Rating

From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 2 min read
Abstract Commentary & Rating

/Reasoning

Contrastive Decoding Improves Reasoning in Large Language Models
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 2 min read
Abstract Commentary & Rating

Graph of Thoughts: Solving Elaborate Problems with Large Language Models
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

/Reinforcement Learning

/Retrieval

PDFTriage: Question Answering over Long, Structured Documents
Prof. Otto Nomos ∙ May 27, 2024 ∙ 2 min read
Abstract Commentary & Rating

RAVEN: In-Context Learning with Retrieval Augmented Encoder-Decoder Language Models
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

/RLHF

Stabilizing RLHF through Advantage Model and Selective Rehearsal
Prof. Otto Nomos ∙ May 24, 2024 ∙ 2 min read
Abstract Commentary & Rating

Statistical Rejection Sampling Improves Preference Optimization
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 2 min read
Abstract Commentary & Rating

Efficient RLHF: Reducing the Memory Usage of PPO
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 2 min read
Abstract Commentary & Rating

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback [Summary]
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 8 min read
Uncover the challenges, limitations, and future of Reinforcement Learning from Human Feedback (RLHF) in AI systems. Explore governance, safety, and more.

/Safety

/Science

AstroLLaMA: Towards Specialized Foundation Models in Astronomy
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 2 min read
Abstract Commentary & Rating

Large Language Model for Science: A Study on P vs. NP
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 2 min read
Abstract Commentary & Rating

PDE-Refiner: Achieving Accurate Long Rollouts with Neural PDE Solvers
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

/Structured Data

Struc-Bench: Are Large Language Models Really Good at Generating Complex Structured Data?
Prof. Otto Nomos ∙ May 25, 2024 ∙ 2 min read
Abstract Commentary & Rating

LMDX: Language Model-based Document Information Extraction and Localization
Prof. Otto Nomos ∙ May 24, 2024 ∙ 2 min read
Abstract Commentary & Rating

/Summarization

Clinical Text Summarization: Adapting Large Language Models Can Outperform Human Experts
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 2 min read
Abstract Commentary & Rating

DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 2 min read
Abstract Commentary & Rating

From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 2 min read
Abstract Commentary & Rating

/Summary

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback [Summary]
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 8 min read
Uncover the challenges, limitations, and future of Reinforcement Learning from Human Feedback (RLHF) in AI systems. Explore governance, safety, and more.

Llama 2: Open Foundation and Fine-Tuned Chat Models [Commentary]
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 5 min read
Deep dive into Llama 2: Explore the pretraining, fine-tuning, safety measures, and insightful discussions in our comprehensive summary.

Challenges and Applications of Large Language Models [Summary]
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 20 min read
Explore our summary and key insights of 'Challenges and Applications of Large Language Models', a research paper that delves into the potential, challenges, and applications of LLMs.

/Survey

A Survey on Model Compression for Large Language Models
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

/Tools

ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 2 min read
Abstract Commentary & Rating

ToolLLM: Facilitating Large Language Models To Master 16000+ Real-World APIs [Commentary]
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 4 min read
Explore the power of Large Language Models (LLMs) in API interaction with our summary of 'Tool LLM'

FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

/Training

Scaling Laws for Sparsely-Connected Foundation Models
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 2 min read
Abstract Commentary & Rating

From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

Shepherd: A Critic for Language Model Generation
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 2 min read
Abstract Commentary & Rating

/Transformers

Cure the headache of Transformers via Collinear Constrained Attention
Prof. Otto Nomos ∙ May 27, 2024 ∙ 2 min read
Abstract Commentary & Rating

Sparse Autoencoders Find Highly Interpretable Features in Language Models
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 2 min read
Abstract Commentary & Rating

Scaling Laws for Sparsely-Connected Foundation Models
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 2 min read
Abstract Commentary & Rating

Uncovering mesa-optimization algorithms in Transformers
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 2 min read
Abstract Commentary & Rating