Vibe Coding vs. Agentic Coding: Fundamentals and Practical Implications of Agentic AI Paper • 2505.19443 • Published May 26, 2025 • 15
Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs Paper • 2506.19290 • Published Jun 24, 2025 • 52
CodeNet: A Large-Scale AI for Code Dataset for Learning a Diversity of Coding Tasks Paper • 2105.12655 • Published May 25, 2021
SWE-Synth: Synthesizing Verifiable Bug-Fix Data to Enable Large Language Models in Resolving Real-World Bugs Paper • 2504.14757 • Published Apr 20, 2025
OctoPack: Instruction Tuning Code Large Language Models Paper • 2308.07124 • Published Aug 14, 2023 • 31
rStar-Coder: Scaling Competitive Code Reasoning with a Large-Scale Verified Dataset Paper • 2505.21297 • Published May 27, 2025 • 29
Developer-LLM Conversations: An Empirical Study of Interactions and Generated Code Quality Paper • 2509.10402 • Published Sep 12, 2025 • 5
WizardCoder: Empowering Code Large Language Models with Evol-Instruct Paper • 2306.08568 • Published Jun 14, 2023 • 32
Granite Code Models: A Family of Open Foundation Models for Code Intelligence Paper • 2405.04324 • Published May 7, 2024 • 25
Knowledge Transfer from High-Resource to Low-Resource Programming Languages for Code LLMs Paper • 2308.09895 • Published Aug 19, 2023 • 1
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models Paper • 2411.04905 • Published Nov 7, 2024 • 127
OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement Paper • 2402.14658 • Published Feb 22, 2024 • 83
Infinity Instruct: Scaling Instruction Selection and Synthesis to Enhance Language Models Paper • 2506.11116 • Published Jun 9, 2025 • 4
Thinking LLMs: General Instruction Following with Thought Generation Paper • 2410.10630 • Published Oct 14, 2024 • 20
KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding Paper • 2503.02951 • Published Mar 4, 2025 • 33
SWE-QA: Can Language Models Answer Repository-level Code Questions? Paper • 2509.14635 • Published Sep 18, 2025 • 34
CodeDPO: Aligning Code Models with Self Generated and Verified Source Code Paper • 2410.05605 • Published Oct 8, 2024 • 1
CodeSteer: Symbolic-Augmented Language Models via Code/Text Guidance Paper • 2502.04350 • Published Feb 4, 2025 • 11
Genetic Instruct: Scaling up Synthetic Generation of Coding Instructions for Large Language Models Paper • 2407.21077 • Published Jul 29, 2024 • 2
OpenCodeReasoning: Advancing Data Distillation for Competitive Coding Paper • 2504.01943 • Published Apr 2, 2025 • 15
Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning Paper • 2508.03501 • Published Aug 5, 2025 • 59
PromptCoT 2.0: Scaling Prompt Synthesis for Large Language Model Reasoning Paper • 2509.19894 • Published Sep 24, 2025 • 33
CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction Paper • 2502.07316 • Published Feb 11, 2025 • 50
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing Paper • 2406.08464 • Published Jun 12, 2024 • 71
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution Paper • 2510.08697 • Published Oct 9, 2025 • 36
Critique-Coder: Enhancing Coder Models by Critique Reinforcement Learning Paper • 2509.22824 • Published Sep 26, 2025 • 20
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Paper • 2506.01939 • Published Jun 2, 2025 • 187
X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests Paper • 2601.06953 • Published Jan 2026 • 30