ICSE 2023

211 papers accepted.

Updated on 2023-09-08.

You can find the latest information here.


Message from the ICSE 2023 Program Co-Chairs.

Predicting Bugs by Monitoring Developers During Task Execution.

Future Software for Life in Trusted Futures.

The Road Toward Dependable AI Based Systems.

Software Engineering as the Linchpin of Responsible AI.

One Adapter for All Programming Languages? Adapter Tuning for Code Search and Summarization.

CCRep: Learning Code Change Representations via Pre-Trained Code Model and Query Back.

Keeping Pace with Ever-Increasing Data: Towards Continual Learning of Code Intelligence Models.

Detecting JVM JIT Compiler Bugs via Exploring Two-Dimensional Input Spaces.

JITfuzz: Coverage-guided Fuzzing for JVM Just-in-Time Compilers.

Validating SMT Solvers via Skeleton Enumeration Empowered by Historical Bug-Triggering Inputs.

Regression Fuzzing for Deep Learning Systems.

Operand-Variation-Oriented Differential Analysis for Fuzzing Binding Calls in PDF Readers.

The untold story of code refactoring customizations in practice.

Data Quality for Software Vulnerability Datasets.

Do code refactorings influence the merge effort?

A Comprehensive Study of Real-World Bugs in Machine Learning Model Optimization.

Evaluating the Impact of Experimental Assumptions in Automated Fault Localization.

Locating Framework-specific Crashing Faults with Compact and Explainable Candidate Set.

PExReport: Automatic Creation of Pruned Executable Cross-Project Failure Reports.

RAT: A Refactoring-Aware Traceability Model for Bug Localization.

How Do We Read Formal Claims? Eye-Tracking and the Cognition of Proofs about Algorithms.

Which of My Assumptions are Unnecessary for Realizability and Why Should I Care?

UPCY: Safely Updating Outdated Dependencies.

APICAD: Augmenting API Misuse Detection through Specifications from Code and Documents.

Compatibility Issue Detection for Android Apps Based on Path-Sensitive Semantic Analysis.

OSSFP: Precise and Scalable C/C++ Third-Party Library Detection using Fingerprinting Functions.

Smartmark: Software Watermarking Scheme for Smart Contracts.

Turn the Rudder: A Beacon of Reentrancy Detection for Smart Contracts on Ethereum.

BSHUNTER: Detecting and Tracing Defects of Bitcoin Scripts.

Do I Belong? Modeling Sense of Virtual Community Among Linux Kernel Contributors.

Comparison and Evaluation of Clone Detection Techniques with Different Code Representations.

Learning Graph-based Code Representations for Source-level Functional Similarity Detection.

The Smelly Eight: An Empirical Study on the Prevalence of Code Smells in Quantum Computing.

Reachable Coverage: Estimating Saturation in Fuzzing.

Learning Seed-Adaptive Mutation Strategies for Greybox Fuzzing.

Improving Java Deserialization Gadget Chain Mining via Overriding-Guided Object Generation.

Evaluating and Improving Hybrid Fuzzing.

Robustification of Behavioral Designs against Environmental Deviations.

A Qualitative Study on the Implementation Design Decisions of Developers.

BFTDETECTOR: Automatic Detection of Business Flow Tampering for Digital Content Service.

FedSlice: Protecting Federated Learning Models from Malicious Participants with Model Slicing.

PTPDroid: Detecting Violated User Privacy Disclosures to Third-Parties of Android Apps.

Adhere: Automated Detection and Repair of Intrusive Ads.

Bad Snakes: Understanding and Improving Python Package Index Malware Scanning.

FedDebug: Systematic Debugging for Federated Learning Applications.

Practical and Efficient Model Extraction of Sentiment Analysis APIs.

CrossCodeBench: Benchmarking Cross-Task Generalization of Source Code Models.

ECSTATIC: An Extensible Framework for Testing and Debugging Configurable Static Analysis.

Responsibility in Context: On Applicability of Slicing in Semantic Regression Analysis.

Does the Stream API Benefit from Special Debugging Facilities? A Controlled Experiment on Loops and Streams with Specific Debuggers.

Fonte: Finding Bug Inducing Commits from Failures.

RepresentThemAll: A Universal Learning Representation of Bug Reports.

Demystifying Exploitable Bugs in Smart Contracts.

Understanding and Detecting On-The-Fly Configuration Bugs.

Explaining Software Bugs Leveraging Code Structures in Neural Machine Translation.

Is It Enough to Recommend Tasks to Newcomers? Understanding Mentoring on Good First Issues.

From Organizations to Individuals: Psychoactive Substance Use By Professional Programmers.

On the Self-Governance and Episodic Changes in Apache Incubator Projects: An Empirical Study.

Socio-Technical Anti-Patterns in Building ML-Enabled Software: Insights from Leaders on the Forefront.

Moving on from the Software Engineers' Gambit: An Approach to Support the Defense of Software Effort Estimates.

Concrat: An Automatic C-to-Rust Lock API Translator for Concurrent Programs.

Triggers for Reactive Synthesis Specifications.

Using Reactive Synthesis: An End-to-End Exploratory Case Study.

Syntax and Domain Aware Model for Unsupervised Program Translation.

Developer-Intent Driven Code Comment Generation.

Data Quality Matters: A Case Study of Obsolete Comment Detection.

Revisiting Learning-based Commit Message Generation.

Commit Message Matters: Investigating Impact and Evolution of Commit Message Quality.

PILAR: Studying and Mitigating the Influence of Configurations on Log Parsing.

Did We Miss Something Important? Studying and Exploring Variable-Aware Log Abstraction.

On the Temporal Relations between Logging and Code.

How Do Developers' Profiles and Experiences Influence their Logging Practices? An Empirical Study of Industrial Practitioners.

When to Say What: Learning to Find Condition-Message Inconsistencies.

SemParser: A Semantic Parser for Log Analytics.

Badge: Prioritizing UI Events with Hierarchical Multi-Armed Bandits for Automated UI Testing.

Efficiency Matters: Speeding Up Automated Testing with GUI Rendering Inference.

CodaMosa: Escaping Coverage Plateaus in Test Generation with Pre-trained Large Language Models.

Taintmini: Detecting Flow of Sensitive Data in Mini-Programs with Static Taint Analysis.

AChecker: Statically Detecting Smart Contract Access Control Vulnerabilities.

Fine-grained Commit-level Vulnerability Type Prediction by CWE Tree Structure.

Silent Vulnerable Dependency Alert Prediction with Vulnerability Key Aspect Explanation.

Reusing Deep Neural Network Models through Model Re-engineering.

PYEVOLVE: Automating Frequent Code Changes in Python ML Systems.

DeepArc: Modularizing Neural Networks for the Model Maintenance.

Decomposing a Recurrent Neural Network into Modules for Enabling Reusability and Replacement.

CHRONOS: Time-Aware Zero-Shot Identification of Libraries from Vulnerability Reports.

Understanding the Threats of Upstream Vulnerabilities to Downstream Projects in the Maven Ecosystem.

SecBench.js: An Executable Security Benchmark Suite for Server-Side JavaScript.

On Privacy Weaknesses and Vulnerabilities in Software Systems.

Detecting Exception Handling Bugs in C++ Programs.

Learning to Boost Disjunctive Static Bug-Finders.

Detecting Isolation Bugs via Transaction Oracle Construction.

SmallRace: Static Race Detection for Dynamic Languages - A Case on Smalltalk.

"STILL AROUND": Experiences and Survival Strategies of Veteran Women Software Developers.

When and Why Test Generators for Deep Learning Produce Invalid Inputs: an Empirical Study.

Fuzzing Automatic Differentiation in Deep-Learning Libraries.

Lightweight Approaches to DNN Regression Error Reduction: An Uncertainty Alignment Perspective.

Revisiting Neuron Coverage for DNN Testing: A Layer-Wise and Distribution-Aware Criterion.

Code Review of Build System Specifications: Prevalence, Purposes, Patterns, and Perceptions.

Better Automatic Program Repair by Using Bug Reports and Tests Together.

CCTEST: Testing and Repairing Code Completion Systems.

KNOD: Domain Knowledge Distilled Tree Decoder for Automated Program Repair.

Rete: Learning Namespace Representation for Program Repair.

AI-based Question Answering Assistance for Analyzing Natural-language Requirements.

Strategies, Benefits and Challenges of App Store-inspired Requirements Elicitation.

Data-driven Recurrent Set Learning For Non-termination Analysis.

Compiling Parallel Symbolic Execution with Continuations.

Verifying Data Constraint Equivalence in FinTech Systems.

Tolerate Control-Flow Changes for Sound Data Race Prediction.

Fill in the Blank: Context-aware Automated Text Input Generation for Mobile GUI Testing.

Columbus: Android App Testing Through Systematic Callback Exploration.

GameRTS: A Regression Testing Framework for Video Games.

Autonomy Is An Acquired Taste: Exploring Developer Preferences for GitHub Bots.

Flexible and Optimal Dependency Management via Max-SMT.

Impact of Code Language Models on Automated Program Repair.

Tare: Type-Aware Neural Program Repair.

Template-based Neural Program Repair.

Automated Repair of Programs from Large Language Models.

Automated Program Repair in the Era of Large Pre-trained Language Models.

Faster or Slower? Performance Mystery of Python Idioms Unveiled with Empirical Evidence.

Usability-Oriented Design of Liquid Types for Java.

Towards Understanding Fairness and its Composition in Ensemble Machine Learning.

Fairify: Fairness Verification of Neural Networks.

Leveraging Feature Bias for Scalable Misprediction Explanation of Machine Learning Models.

Information-Theoretic Testing and Debugging of Fairness Defects in Deep Neural Networks.

Demystifying Privacy Policy of Third-Party Libraries in Mobile Apps.

Cross-Domain Requirements Linking via Adversarial-based Domain Adaptation.

On-Demand Security Requirements Synthesis with Relational Generative Adversarial Networks.

Measuring Secure Coding Practice and Culture: A Finger Pointing at the Moon is not the Moon.

What Challenges Do Developers Face About Checked-in Secrets in Software Artifacts?

Lejacon: A Lightweight and Efficient Approach to Java Confidential Computing on SGX.

Keyword Extraction From Specification Documents for Planning Security Mechanisms.

Dependency Facade: The Coupling and Conflicts between Android Framework and Its Customization.

Test Selection for Unified Regression Testing.

Measuring and Mitigating Gaps in Structural Testing.

Heterogeneous Anomaly Detection for Software Systems via Semi-supervised Cross-modal Attention.

Recommending Root-Cause and Mitigation Steps for Cloud Incidents using Large Language Models.

Eadro: An End-to-End Troubleshooting Framework for Microservices on Multi-source Data.

LogReducer: Identify and Reduce Log Hotspots in Kernel on the Fly.

Aries: Efficient Testing of Deep Neural Networks via Labeling-Free Accuracy Estimation.

CC: Causality-Aware Coverage Criterion for Deep Neural Networks.

Balancing Effectiveness and Flakiness of Non-Deterministic Machine Learning Tests.

Many-Objective Reinforcement Learning for Online Testing of DNN-Enabled Systems.

Reliability Assurance for Deep Neural Network Architectures Against Numerical Defects.

Demystifying Issues, Challenges, and Solutions for Multilingual Software Development.

Automated Summarization of Stack Overflow Posts.

Semi-Automatic, Inline and Collaborative Web Page Code Curations.

Identifying Key Classes for Initial Software Comprehension: Can We Do It Better?

Improving API Knowledge Discovery with ML: A Case Study of Comparable API Methods.

Evidence Profiles for Validity Threats in Program Comprehension Experiments.

Developers' Visuo-spatial Mental Model and Program Comprehension.

Two Sides of the Same Coin: Exploiting the Impact of Identifiers in Neural Code Comprehension.

SeeHow: Workflow Extraction from Programming Screencasts through Action-Aware Video Analytics.

AidUI: Toward Automated Recognition of Dark Patterns in User Interfaces.

Carving UI Tests to Generate API Tests and API Specification.

Ex pede Herculem: Augmenting Activity Transition Graph for Apps via Graph Convolution Network.

Sustainability is Stratified: Toward a Better Theory of Sustainable Software Engineering.

DLInfer: Deep Learning with Static Slicing for Python Type Inference.

ViolationTracker: Building Precise Histories for Static Analysis Violations.

Generating Test Databases for Database-Backed Applications.

Testing Database Engines via Query Plan Guidance.

Testing Database Systems via Differential Query Execution.

Analysing the Impact of Workloads on Modeling the Performance of Configurable Software Systems.

Twins or False Friends? A Study on Energy Consumption and Performance of Configurable Software.

Learning Deep Semantics for Test Completion.

SkCoder: A Sketch-based Approach for Automatic Code Generation.

An Empirical Comparison of Pre-Trained Models of Source Code.

On the Robustness of Code Generation Techniques: An Empirical Study on GitHub Copilot.

Source Code Recommender Systems: The Practitioners' Perspective.

Safe Low-Level Code Without Overhead is Practical.

Sibyl: Improving Software Engineering Tools with SMT Selection.

Coverage Guided Fault Injection for Cloud Systems.

Diver: Oracle-Guided SMT Solver Testing with Unrestricted Random Mutations.

An Empirical Study of Deep Learning Models for Vulnerability Detection.

DeepVD: Toward Class-Separation Features for Neural Network Vulnerability Detection.

Enhancing Deep Learning-based Vulnerability Detection by Building Behavior Graph Model.

Vulnerability Detection with Graph Simplification and Enhanced Graph Representation Learning.

Does data sampling improve deep learning-based vulnerability detection? Yeas! and Nays!

Incident-aware Duplicate Ticket Aggregation for Cloud Systems.

Large Language Models are Few-shot Testers: Exploring LLM-based General Bug Reproduction.

On the Reproducibility of Software Defect Datasets.

Context-aware Bug Reproduction for Mobile Apps.

Read It, Don't Watch It: Captioning Bug Recordings Automatically.

Duetcs: Code Style Transfer through Generation and Retrieval.

On the Applicability of Language Models to Block-Based Programs.

MTTM: Metamorphic Testing for Textual Content Moderation Software.

Metamorphic Shader Fusion for Testing Graphics Shader Compilers.

MorphQ: Metamorphic Testing of the Qiskit Quantum Computing Platform.

Log Parsing with Prompt-based Few-shot Learning.

An Empirical Study of Pre-Trained Model Reuse in the Hugging Face Deep Learning Model Registry.

ContraBERT: Enhancing Code Pre-trained Models via Contrastive Learning.

DStream: A Streaming-Based Highly Parallel IFDS Framework.

(Partial) Program Dependence Learning.

MirrorTaint: Practical Non-intrusive Dynamic Taint Tracking for JVM-based Microservice Systems.

VULGEN: Realistic Vulnerability Generation Via Pattern Mining and Deep Learning.

Compatible Remediation on Vulnerabilities from Third-Party Libraries for Java Projects.

Automated Black-Box Testing of Mass Assignment Vulnerabilities in RESTful APIs.

CoLeFunDa: Explainable Silent Vulnerability Fix Identification.

Finding Causally Different Tests for an Industrial Control System.

Doppelgänger Test Generation for Revealing Bugs in Autonomous Driving Software.

Generating Realistic and Diverse Tests for LiDAR-Based Perception Systems.

Rules of Engagement: Why and How Companies Participate in OSS.

An Empirical Study on Software Bill of Materials: Where We Stand and the Road Ahead.