ICSE 2020

129 papers accepted.

Updated on 2023-09-08.

You can find the latest information here.


Learning-to-rank vs ranking-to-learn: strategies for regression testing in continuous integration.

A cost-efficient approach to building in continuous integration.

Practical fault detection in Puppet programs.

Learning from, understanding, and supporting DevOps artifacts for Docker.

Adapting requirements models to varying environments.

Comparing formal tools for system design: a judgment study.

Debugging inputs.

Causal testing: understanding defects' root causes.

Impact analysis of cross-project bugs on software ecosystems.

Taming behavioral backward incompatibilities via cross-project testing and analysis.

Watchman: monitoring dependency conflicts for Python library ecosystem.

One size does not fit all: a grounded theory and online survey study of developer preferences for security warning types.

Schrödinger's security: opening the box on app developers' security rationale.

How software practitioners use informal local meetups to share software engineering knowledge.

Predicting developers' negative feelings about code review.

Near-duplicate detection in web app model inference.

Extracting taint specifications for JavaScript libraries.

SLACC: simion-based language agnostic code clones.

Finding client-side business flow tampering vulnerabilities.

Securing unsafe Rust programs with XRust.

Is Rust used safely by software developers?

Burn after reading: a shadow stack with microsecond-level runtime rerandomization for protecting return addresses.

SAVER: scalable, precise, and safe memory-error repair.

Revealing injection vulnerabilities by leveraging existing tests.

RoScript: a visual script driven truly non-intrusive robotic testing system for touch screen applications.

Translating video recordings of mobile app usages into replayable scenarios.

Unblind your apps: predicting natural-language labels for mobile GUI components by deep learning.

DeepBillboard: systematic physical-world testing of autonomous driving systems.

Misbehaviour prediction for autonomous driving systems.

Approximation-refinement testing of compute-intensive cyber-physical models: an approach based on system identification.

A comprehensive study of autonomous vehicle bugs.

Studying the use of Java logging utilities in the wild.

A study on the prevalence of human values in software engineering publications, 2015-2018.

Explaining pair programming session dynamics from knowledge gaps.

Engineering gender-inclusivity into software: ten teams' tales from the trenches.

How has forking changed in the last 20 years?: a study of hard forks on GitHub.

Multiple-entry testing of Android applications by constructing activity launching contexts.

ComboDroid: generating high-quality test inputs for Android apps via use case combinations.

Time-travel testing of Android apps.

HeteroRefactor: refactoring for heterogeneous computing with FPGA.

HARP: holistic analysis for refactoring Python-based analytics programs.

CC2Vec: distributed representations of code changes.

Empirical review of automated analysis tools on 47,587 Ethereum smart contracts.

Gap between theory and practice: an empirical study of security patches in Solidity.

An investigation of cross-project learning in online just-in-time software defect prediction.

Understanding the automated parameter optimization on transfer learning for cross-project defect prediction: an empirical study.

Software visualization and deep transfer learning for effective software defect prediction.

Software documentation: the practitioners' perspective.

DLFix: context-based code transformation learning for automated program repair.

On the efficiency of test suite based program repair: a systematic assessment of 16 automated repair systems for Java programs.

Caspar: extracting and synthesizing user stories of problems from app reviews.

Detection of hidden feature requests from massive chat messages via deep Siamese network.

A tale from the trenches: cognitive biases and software development.

Recognizing developers' emotions while programming.

Neurological divide: an fMRI study of prose and code writing.

Here we go again: why is it difficult for developers to learn another programming language?

Importance-driven deep learning system testing.

ReluDiff: differential verification of deep neural networks.

Dissector: input validation for deep learning applications by crossing-layer dissection.

Towards characterizing adversarial defects of deep learning software from the lens of uncertainty.

Gang of eight: a defect taxonomy for infrastructure as code scripts.

MemLock: memory usage guided fuzzing.

sFuzz: an efficient adaptive fuzzer for Solidity smart contracts.

Targeted greybox fuzzing with static lookahead analysis.

Planning for untangling: predicting the difficulty of merge conflicts.

Conquering the extensional scalability problem for value-flow analysis frameworks.

Tailoring programs for static analysis via program transformation.

Pipelining bottom-up data flow analysis.

A novel approach to tracing safety requirements and state-based design models.

How Android developers handle evolution-induced API compatibility issues: a large-scale study.

An empirical study on API parameter rules.

When APIs are intentionally bypassed: an exploratory study of API workarounds.

Demystify official API usage directives with crowdsourced API misuse scenarios, erroneous code examples and patches.

Simulee: detecting CUDA synchronization bugs via memory-access modeling.

White-box fairness testing through adversarial sampling.

Structure-invariant testing for machine translation.

Automatic testing and improvement of machine translation.

TRADER: trace divergence analysis and embedding regulation for debugging recurrent neural networks.

Typestate-guided fuzzer for discovering use-after-free vulnerabilities.

JVM fuzzing for JIT-induced side-channel detection.

Ankou: guiding grey-box fuzzing towards combinatorial difference.

BCFA: bespoke control flow analysis for CFA at scale.

On the recall of static call graph construction in practice.

Heaps'n leaks: how heap snapshots improve Android taint analysis.

Big code != big vocabulary: open-vocabulary models for source code.

Improving data scientist efficiency with provenance.

Managing data constraints in database-backed web applications.

Taxonomy of real faults in deep learning systems.

Testing DNN image classifiers for confusion & bias errors.

Repairing deep neural networks: fix patterns and challenges.

Fuzz testing based data augmentation to improve robustness of deep neural networks.

An empirical study on program failures of deep learning jobs.

Primers or reminders?: the effects of existing review comments on code review.

Mitigating turnover with code review recommendation: balancing expertise, workload, and knowledge distribution.

How do companies collaborate in open source ecosystems?: an empirical study of OpenStack.

How to not get rich: an empirical study of donations in open source.

Scaling open source communities: an empirical study of the Linux kernel.

SpecuSym: speculative symbolic execution for cache timing leak detection.

Symbolic verification of message passing interface programs.

Efficient generation of error-inducing floating-point inputs via symbolic execution.

HyDiff: hybrid differential software analysis.

Seenomaly: vision-based linting of GUI animation effects against design-don't guidelines.

Low-overhead deadlock prediction.

An empirical assessment of security risks of global Android banking apps.

Accessibility issues in Android apps: state of affairs, sentiments, and ways forward.

Collaborative bug finding for Android apps.

POSIT: simultaneously tagging natural and programming languages.

CPC: automatically classifying and propagating natural language comments via program analysis.

Suggesting natural method names to check name consistencies.

Retrieval-based neural source code summarization.

On learning meaningful assert statements for unit test cases.

Quickly generating diverse valid test inputs with reinforcement learning.

An evidence-based inquiry into the use of grey literature in software engineering.

Towards the use of the readily available tests from the release pipeline as performance tests: are we there yet?

Verifying object construction.

Automatically testing string solvers.

A study on the lifecycle of flaky tests.

Testing file system implementations on layered models.

Co-evolving code with evolving metamodels.

Lazy product discovery in huge configuration spaces.

Reducing run-time adaptation space via analysis of possible utility bounds.

Context-aware in-process crowdworker recommendation.

A large-scale empirical study on vulnerability distribution within projects and the lessons learned.

Unsuccessful story about few shot malware family classification and Siamese network to the rescue.

How does misconfiguration of analytic services compromise mobile privacy?

Interpreting cloud computer vision pain-points: a mining study of Stack Overflow.