Relevant for: Workspace administrators, administrators (see "User roles")

You have clicked on “Insights” in the header of the administration interface, selected filter options, and clicked on “Details” under “Benchmark - Clusters and Competencies by Exercise”.
Here you receive an exercise-specific evaluation of the ratings for the selected competency, across all assessments, candidates, and observers. A distinction is made between the competency level and the behavior anchor level. The Insights provide various statistics for each individual exercise in which the selected competency and its behavior anchors were evaluated.
At the top, you first see the competency you selected to view the exercise-specific details. Next to it, the competency clusters to which the selected competency is assigned are displayed in gray text.
If you have not stored any competency clusters, you will only see the name of the selected competency here.
Below that, you will see each exercise with its details at the competency level and the behavior anchor level. This structure is repeated for every exercise in which the chosen competency was observed and evaluated.
Statistics - Competency
Under “Statistics - Competency” you see three statistical metrics of the ratings given for the entire competency:
- Agreement
- Observability
- Internal consistency
The metrics are explained in detail further down in this article. You can also click the info icon (“i”) next to the statistics at any time to display an explanation.
Statistics - Behavior Anchors
Under “Statistics - Behavior Anchors” you see various statistical metrics of the ratings given for the individual behavior anchors of a competency:
- Mean
- Standard deviation
- Agreement
- Observability
Internal consistency is not calculated here, as it describes the relationship between several behavior anchors and is therefore not meaningful for a single anchor.
The left-hand column of the table lists the individual behavior anchors of the selected competency that were observed and evaluated in the assessments for this exercise. The header row lists the statistical metrics that are displayed. Each cell of the table contains the value of one metric for one behavior anchor in this exercise (i.e., competency- and exercise-specific).
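If you want to reproduce the two descriptive metrics in the table yourself, the following minimal Python sketch computes the mean and standard deviation per behavior anchor. The column names, anchor labels, and sample ratings are purely illustrative and do not reflect Applysia's actual data format:

```python
import pandas as pd

# Illustrative long-format ratings: one row per rating of one behavior anchor.
ratings = pd.DataFrame({
    "anchor": ["Listens actively"] * 4 + ["Asks open questions"] * 4,
    "rating": [4, 5, 4, 3, 2, 3, 2, 4],
})

# Mean and standard deviation per anchor, as shown in the Insights table.
stats = ratings.groupby("anchor")["rating"].agg(["mean", "std"])
print(stats)
```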
The metrics mentioned above are explained in detail below. They are also color-coded in the Insights, depending on how good the values are. This is intended to give you a rough guideline for interpreting the results:
- A green marking stands for a good value,
- a yellow marking indicates significant potential for improvement, and
- an orange marking suggests that the competency or behavior anchor in question should be revised.
Agreement
In the Applysia Insights, observer agreement is automatically calculated (for the statisticians among you: the intraclass correlation) – both for the ratings of the competencies and for the ratings of the behavior anchors.
Agreement is thus a measure of the extent to which the ratings of different observers, e.g., for a behavior anchor, coincide. The differences are averaged across candidates to ensure that only general tendencies and not individual profiles influence the data. If an anchor has a very low agreement, for example, this could indicate that it is not clearly formulated enough and thus leads to misunderstandings among the observers about what exactly is to be rated. This can result in observers giving very different ratings, even though they observed the candidate at the same time. A high level of agreement is desirable here.
In short: How high is the agreement (of the ratings) among the different observers?
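To get a feeling for this metric, here is a minimal Python sketch that computes the intraclass correlation with the pingouin library. The column names and sample ratings are illustrative only, and since the article does not specify which ICC variant is used, the sketch simply prints all common variants:

```python
import pandas as pd
import pingouin as pg

# Illustrative ratings: 5 candidates, each rated by the same 3 observers.
ratings = pd.DataFrame({
    "candidate": ["A"] * 3 + ["B"] * 3 + ["C"] * 3 + ["D"] * 3 + ["E"] * 3,
    "observer":  [1, 2, 3] * 5,
    "rating":    [4, 4, 5,  2, 3, 2,  5, 5, 4,  3, 2, 3,  4, 3, 4],
})

# Intraclass correlation: how strongly the observers' ratings coincide.
icc = pg.intraclass_corr(
    data=ratings, targets="candidate", raters="observer", ratings="rating"
)
print(icc[["Type", "Description", "ICC"]])
```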
Observability
Observability provides information on whether a competency or a behavior anchor can be observed at all in an exercise. If, for example, half of all observers do not rate a behavior anchor in an exercise, this could indicate that this behavior anchor often cannot be (clearly) observed in this exercise. This is a good starting point for specifically reviewing the design of the exercises and revising them if necessary, so that the desired competencies can be assessed clearly. The goal here should be an observability of 100%.
In short: Were all competencies and behavior anchors rated by all observers?
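As a rough illustration, observability can be thought of as the share of expected ratings that were actually given. A minimal Python sketch, with purely illustrative data:

```python
# One behavior anchor in one exercise: observer -> rating, None = not rated.
ratings = {
    "observer_1": 4,
    "observer_2": None,  # this observer could not rate the anchor
    "observer_3": 5,
    "observer_4": 3,
}

rated = sum(1 for r in ratings.values() if r is not None)
observability = rated / len(ratings) * 100
print(f"Observability: {observability:.0f}%")  # 75%: 1 of 4 ratings missing
```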
Internal consistency
Internal consistency (for the statisticians among you: Cronbach’s alpha) helps to assess the extent to which the behavior anchors are suitable for assessing a coherent competency. A high value means that the anchors are strongly interrelated and thus probably suitable for assessing the same competency. If the value is low, at least one of the anchors probably does not fit with the others. This can particularly be the case with rather broad competency classes and “combined” competencies such as “Strategy and action competency”. In principle, lower values do not mean that the behavior anchors are “bad”, but rather that it is worth taking a closer look, as it might make sense to split the competency.
In short: Do the individual behavior anchors fit together and measure the same competency?
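For those who want to reproduce the calculation, here is a minimal Python sketch of Cronbach’s alpha for one competency, based on the standard textbook formula. The rating matrix is illustrative only:

```python
import numpy as np

# Illustrative ratings for one competency: rows are candidates,
# columns are the competency's behavior anchors.
scores = np.array([
    [4, 4, 5],
    [2, 3, 2],
    [5, 4, 4],
    [3, 3, 3],
])

k = scores.shape[1]                          # number of behavior anchors
item_vars = scores.var(axis=0, ddof=1)       # variance of each anchor
total_var = scores.sum(axis=1).var(ddof=1)   # variance of the sum score

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total variance).
alpha = k / (k - 1) * (1 - item_vars.sum() / total_var)
print(f"Cronbach's alpha: {alpha:.2f}")  # ~0.89 for this sample: high consistency
```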