ICCMA 2025 consists of four tracks:
Each track is composed of multiple subtracks, each defined by a combination of a reasoning problem and an argumentation semantics.
Argumentation systems (solvers) can be submitted for evaluation in any choice of subtracks. That is, a solver is not required to support, for example, all semantics for a specific reasoning problem, or all reasoning problems for a specific semantics.
NEW FOR 2025: For each (sub-)track (except the No-Limits Track), an argumentation system can additionally take part in a fixed-SAT solver comparison. A participating solver is then also ranked in a separate list containing only the other solvers taking part in this comparison (in addition to the normal ranking). Note the special requirements for taking part in this comparison.
NEW FOR 2025: In ICCMA 2025 we narrowed the list of semantics to some degree. Note that the stage semantics is not planned to be part of ICCMA 2025.
Note: The focus of the Main Track is to evaluate sequential core argumentation reasoning engines available in open source. Systems combining different core reasoning engines e.g. via portfolio-style techniques, systems employing parallel computations via the use of multiple processor cores, as well as systems which will not be made available in open source are invited to the special No-Limits Track which consists of the same subtracks as the Main Track. The organizers reserve the right to move Main Track submissions to No-Limits based on case-by-case analysis.
We recall the definition of Dung's Abstract Argumentation Frameworks (AFs) [Dung 95] and their semantics.
An AF is a directed graph F = (A,R) where A is a set of arguments, and R ⊆ A × A is the attack relation. For a,b ∈ A, we say that a attacks b when (a,b) ∈ R. If in addition b attacks c ∈ A, then a defends c against b. The same concepts are extended to sets of arguments: S ⊆ A attacks (respectively defends) an argument b ∈ A if there is some a ∈ S that attacks (respectively defends) b.
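The attack and defence notions above can be illustrated with a minimal Python sketch. The representation (arguments as strings, R as a set of pairs) and the function names are assumptions made for illustration; they are not part of any ICCMA input format or API.

```python
# Hypothetical representation: an AF is a set of arguments A and a set R of (attacker, target) pairs.
A = {"a", "b", "c"}
R = {("a", "b"), ("b", "c")}

def attacks(R, a, b):
    """a attacks b when (a, b) is in R."""
    return (a, b) in R

def set_attacks(R, S, b):
    """S attacks b if some member of S attacks b."""
    return any((a, b) in R for a in S)

def defends_against(R, a, c, b):
    """a defends c against b: a attacks b and b attacks c."""
    return (a, b) in R and (b, c) in R

print(attacks(R, "a", "b"))               # True
print(set_attacks(R, {"a", "c"}, "b"))    # True, via a
print(defends_against(R, "a", "c", "b"))  # True
```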
The relevant semantics of AFs are defined as follows. Let the range of S ⊆ A be S ∪ {a ∈ A | S attacks a}. A set S ⊆ A is
The semantics included in the Main Track are complete (CO), preferred (PR), stable (ST), semi-stable (SST), and ideal (ID).
The reasoning problems included in the Main Track are
Each subtrack (i.e. semantics and reasoning mode combination) is ranked separately. The time limit is 1200 seconds CPU time per instance, and PAR-2 scoring is used. That is, the score for a given solver on an instance is 2 * 1200 if the solver timed out on this instance, and otherwise the CPU running time of the solver on this instance in seconds. The score for a solver on a subtrack is the sum of the solver’s scores over all instances of the subtrack. The winner is the solver with the lowest score.
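As a worked illustration of the PAR-2 scoring described above, the following Python sketch sums per-instance scores for one solver on one subtrack. The function name and the convention of marking a timeout with None are assumptions made for this example.

```python
def par2_score(cpu_times, time_limit=1200.0):
    """Sum PAR-2 scores; None (or a time above the limit) counts as a timeout."""
    total = 0.0
    for t in cpu_times:
        if t is None or t > time_limit:
            total += 2 * time_limit   # penalty: twice the time limit
        else:
            total += t                # solved: CPU running time in seconds
    return total

# Two solved instances and one timeout: 10.0 + 2400.0 + 5.5
print(par2_score([10.0, None, 5.5]))  # 2415.5
```

Lower totals are better, so a single timeout costs as much as 2400 seconds of running time.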
The No-Limits Track is a more permissive version of the Main Track and has the same problems as subtracks. In particular, solvers are allowed to run on multiple cores and can be portfolio-based (i.e. combine the usage of several solvers). The ranking is otherwise the same as for the Main Track, but wall-clock time is used instead of CPU time.
Abstract argumentation, as defined above. Correctness requirements and ranking differ from those of the other tracks: incorrect solutions are simply discarded and only the number of correct solutions is taken into account. The time limit for the track is lower than for the other tracks, namely 60 seconds CPU time per instance.
Semantics: CO, PR, ST, SST, and ID. Reasoning modes: DC-σ and DS-σ. Each of the following combinations of semantics and reasoning mode is a subtrack:
Each subtrack (i.e. semantics and reasoning mode combination) is ranked separately. The solver with the largest number of correctly solved instances within a time limit of 60 seconds wins. If needed, cumulative CPU running time over solved instances is used as a tie-breaker.
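The ranking rule above can be sketched in Python as follows. The data layout (a dictionary mapping a solver name to a list of (is_correct, cpu_seconds) pairs) and the function name are hypothetical and serve only to illustrate the rule; incorrect or late answers are simply not counted.

```python
def rank_solvers(results, time_limit=60.0):
    """Order solver names by number of correct answers (descending),
    breaking ties by cumulative CPU time over solved instances (ascending)."""
    def key(name):
        solved = [t for ok, t in results[name] if ok and t <= time_limit]
        return (-len(solved), sum(solved))
    return sorted(results, key=key)

results = {
    "s1": [(True, 1.0), (True, 2.0), (False, 0.5)],  # 2 correct, 3.0 s
    "s2": [(True, 0.5), (True, 1.0), (True, 3.0)],   # 3 correct
    "s3": [(True, 0.5), (True, 0.5), (False, 1.0)],  # 2 correct, 1.0 s
}
print(rank_solvers(results))  # ['s2', 's3', 's1']
```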
Abstract argumentation, as defined above. Dynamic changes to an initial AF and acceptance queries are issued by different applications via IPAFAIR, an API for incremental reasoning in abstract argumentation. Please see ipafair.py in the repository for more details.
The subtracks in the Dynamic Track are DC-CO, DS-PR, DC-ST, and DS-ST. Each subtrack consists of applications that call an AF solver implementing the interface and apply changes to the underlying AF between solver calls. The name of the subtrack determines the allowed reasoning task and semantics. For example, in the DS-PR subtrack, the AF solver is initialized using the preferred semantics, and only solve_skept() calls are allowed, each with a single query argument. Both the query argument and the underlying AF may change between these calls.
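The call pattern described above (query, change the AF, query again) can be illustrated with a toy Python class in the spirit of the DS-ST subtrack. This is not the IPAFAIR API: the class ToyAFSolver and all of its method names and signatures other than the name solve_skept are hypothetical, and the brute-force enumeration of stable extensions is only practical for tiny AFs. It assumes the standard definition of a stable extension (conflict-free and attacking every argument outside the set) and treats skeptical acceptance as vacuously true when no stable extension exists.

```python
from itertools import combinations

class ToyAFSolver:
    """Hypothetical stand-in for an incremental AF solver (not the IPAFAIR API)."""

    def __init__(self):
        self.args = set()
        self.attacks = set()

    def add_argument(self, a):
        self.args.add(a)

    def add_attack(self, a, b):
        self.attacks.add((a, b))

    def del_attack(self, a, b):
        self.attacks.discard((a, b))

    def _stable_extensions(self):
        # Enumerate all subsets; keep conflict-free sets attacking everything outside.
        args = sorted(self.args)
        for k in range(len(args) + 1):
            for combo in combinations(args, k):
                s = set(combo)
                conflict_free = not any((a, b) in self.attacks for a in s for b in s)
                attacked = {b for (a, b) in self.attacks if a in s}
                if conflict_free and attacked >= self.args - s:
                    yield s

    def solve_skept(self, query):
        """True if the query argument is in every stable extension."""
        return all(query in ext for ext in self._stable_extensions())

solver = ToyAFSolver()
for arg in ("a", "b", "c"):
    solver.add_argument(arg)
solver.add_attack("a", "b")
solver.add_attack("b", "c")
print(solver.solve_skept("c"))  # True: the only stable extension is {a, c}
solver.del_attack("a", "b")     # the AF changes between solver calls
print(solver.solve_skept("c"))  # False: the only stable extension is now {a, b}
```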
Each subtrack (i.e. semantics and reasoning mode combination) is ranked separately. The time limit is 1200 seconds CPU time per instance, and PAR-2 scoring is used. That is, the score for a given solver on an instance is 2 * 1200 if the solver timed out on this instance, and otherwise the CPU running time of the solver on this instance in seconds. The score for a solver on a subtrack is the sum of the solver’s scores over all instances of the subtrack. The winner is the solver with the lowest score.
Assumption-based Argumentation (ABA) [Bondarenko et al 97] and the corresponding semantics are defined as follows.
An ABA framework is a tuple F = (L,R,A,‾) where
A sentence a ∈ L is derivable from a set X ⊆ A via rules R, denoted by X ⊢ a, if a ∈ X or there is a sequence of rules (r1,...,rn) such that head(rn) = a and for each rule ri we have ri ∈ R and each sentence in the body of ri is derived from rules earlier in the sequence or is in X. A set of assumptions A1 attacks a set of assumptions A2 if the contrary of some a ∈ A2 is derivable from A1. A set of assumptions A1 defends an assumption a if A1 attacks every set of assumptions that attacks a.
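Derivability and attack as defined above can be computed by forward chaining, as in the following minimal Python sketch. The representation (rules as (head, body) pairs, the contrary function as a dictionary) and the function names are assumptions made for illustration and do not reflect the competition's input format.

```python
def derivable(X, rules):
    """Return every sentence derivable from the assumption set X via the rules."""
    derived = set(X)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            # Fire a rule once all sentences in its body have been derived.
            if head not in derived and set(body) <= derived:
                derived.add(head)
                changed = True
    return derived

def aba_attacks(A1, A2, rules, contrary):
    """A1 attacks A2 if the contrary of some assumption in A2 is derivable from A1."""
    closure = derivable(A1, rules)
    return any(contrary[a] in closure for a in A2)

# Hypothetical framework: assumptions a and b; rules p <- a and q <- p, b.
rules = [("p", ["a"]), ("q", ["p", "b"])]
contrary = {"a": "q", "b": "p"}

print(sorted(derivable({"a"}, rules)))            # ['a', 'p']
print(aba_attacks({"a"}, {"b"}, rules, contrary)) # True: p is the contrary of b
print(aba_attacks({"b"}, {"a"}, rules, contrary)) # False: q is not derivable from {b}
```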
The semantics for an ABA framework F can be defined as follows. Given a set of assumptions X ⊆ A,
Semantics: CO, PR, ST.
The reasoning tasks for the ABA Track are
Each of the following combinations of semantics and reasoning mode is a subtrack:
Each subtrack (i.e. semantics and reasoning mode combination) is ranked separately. The time limit is 1200 seconds CPU time per instance, and PAR-2 scoring is used. That is, the score for a given solver on an instance is 2 * 1200 if the solver timed out on this instance, and otherwise the CPU running time of the solver on this instance in seconds. The score for a solver on a subtrack is the sum of the solver’s scores over all instances of the subtrack. The winner is the solver with the lowest score.
Any solver can be marked for taking part in the Fixed-SAT-Solver comparison. The solver will then be evaluated normally and, additionally, for each (sub-)track (except No-Limits) all solvers participating in the Fixed-SAT-Solver comparison will be compared separately. This is meant to evaluate solvers that rely heavily on SAT solvers in a more uniform comparison. Otherwise, the choice of different SAT solvers might lead to differences in performance, making fair comparisons of such solvers difficult.
If a solver takes part in the Fixed-SAT-Solver comparison, it must use the IPASIR interface for incremental SAT solvers. We will provide an example later on. The solver will then interact with a SAT solver only through the IPASIR interface and will be compiled with a specific SAT solver for ICCMA 2025. Taking part in the Fixed-SAT-Solver comparison excludes the use of any other search engine similar to SAT solvers (such as answer set programming). Usage of a SAT solver is limited to the IPASIR interface (i.e., the solver may only use the interface to delegate computationally intensive computations). The organizers reserve the right to remove submissions from the Fixed-SAT-Solver comparison based on a case-by-case analysis.
[Dung 95] P. M. Dung, On the Acceptability of Arguments and its Fundamental Role in Nonmonotonic Reasoning, Logic Programming and n-Person Games. Artif. Intell. 77(2): 321-358 (1995)
[Bondarenko et al 97] A. Bondarenko, P. M. Dung, R. Kowalski, F. Toni, An Abstract, Argumentation-Theoretic Approach to Default Reasoning. Artif. Intell. 93: 63-101 (1997)