AEM Mocks builds fail sporadically with "Unable to inject mandatory reference" error

Description

from time to time builds for AEM Mocks are failing with errors like:

it's not always the same service that is affected. most of the time it's adapterManager, eventAdmin or both.

recent examples:

seems to be a race condition or a concurrency problem. we are running the junit 5 tests in parallel (see parent_toplevel config).

Activity

Stefan Seifert December 7, 2021 at 11:59 AM

i think i found the root cause and fixed it in https://issues.apache.org/jira/browse/SLING-10973

the MockServiceRegistration class used an integer field to generate a unique service ID (incrementing long) for each new service registration. this id was also used as comparison/equals key in the hash set for storing the actual service registrations in the bundle context.

this ID was not generated in a thread-safe manner. this is normally no problem as all operations in osgi-mock are done synchronously (except sending OSGi events). but as we have seen in the log from my previous comment, the sling component org.apache.sling.resourceresolver.impl.ResourceAccessSecurityTracker seems to launch a separate thread (i suppose it's an executor service) to register additional sling mappings. directly after this, the StringInterpolationProvider was provided.

so it may happen that two service registrations (e.g. a service user mapping and the string interpolation provider) get exactly the same service ID, and so one of them is dropped when it is added to the set of service registrations.
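
the mechanism described above can be illustrated with a minimal, self-contained sketch. this is not the actual osgi-mock code: the class ServiceIdSketch, the Registration type and all method names are hypothetical and only mirror the description (equality based on the service ID, registrations stored in a hash set). the duplicate ID is hard-coded so the effect is reproducible without relying on thread timing, and an AtomicLong is shown as one possible thread-safe way to generate IDs, assuming nothing about how SLING-10973 was actually fixed.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class ServiceIdSketch {

    /** Hypothetical stand-in for a service registration; equality uses the ID only. */
    static final class Registration {
        final long serviceId;
        final String name;

        Registration(long serviceId, String name) {
            this.serviceId = serviceId;
            this.name = name;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof Registration
                    && ((Registration) other).serviceId == serviceId;
        }

        @Override
        public int hashCode() {
            return Long.hashCode(serviceId);
        }
    }

    // Unsafe pattern: ++ on a plain field is a read-modify-write sequence,
    // so two threads may read the same old value and return the same ID.
    private static long unsafeCounter;

    public static long nextUnsafeId() {
        return ++unsafeCounter;
    }

    // Thread-safe alternative: incrementAndGet() is a single atomic operation.
    private static final AtomicLong SAFE_COUNTER = new AtomicLong();

    public static long nextSafeId() {
        return SAFE_COUNTER.incrementAndGet();
    }

    /** Adds two registrations that share one ID; the set keeps only the first. */
    public static int registerTwoWithSameId() {
        Set<Registration> registrations = new HashSet<>();
        registrations.add(new Registration(42L, "serviceUserMapping"));
        registrations.add(new Registration(42L, "stringInterpolationProvider"));
        return registrations.size();
    }

    /** Generates IDs from several threads and counts how many distinct IDs came out. */
    public static int countDistinctSafeIds(int threads, int idsPerThread) {
        Set<Long> ids = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < idsPerThread; i++) {
                    ids.add(nextSafeId());
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ids.size();
    }

    public static void main(String[] args) {
        System.out.println("distinct registrations: " + registerTwoWithSameId());
        System.out.println("distinct safe IDs: " + countDistinctSafeIds(8, 10_000));
    }
}
```

in this sketch the second registration with the colliding ID is silently ignored by the set, which matches the symptom of a service that was registered but cannot be found afterwards. with the atomic counter, every ID generated across all threads is distinct.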

Stefan Seifert December 7, 2021 at 11:08 AM

for the first time i managed to capture a run with logging of a missing StringInterpolationProvider.
the log shows that it is actually registered, but not found as a service reference directly afterwards.

Stefan Seifert December 7, 2021 at 9:20 AM
Edited

i've discovered that this problem still occurs.

the strange thing is that it does not occur when DEBUG logging for osgi-mock is enabled.
it does occur in 30-50% of aem-mock test runs when DEBUG logging is disabled for osgi-mock (it does not matter whether DEBUG logging for sling-mock/aem-mock is enabled or not).

i already did a thorough analysis of the actual debug log statements and isDebugEnabled blocks in osgi-mock and tested several variations on a branch. it's very spooky: one single debug line makes the difference. if this line is present, all tests run fine. if not, they fail with high probability:
https://github.com/apache/sling-org-apache-sling-testing-osgi-mock/blob/17245c3847ee00d824c53d35773ba051076f531b/core/src/main/java/org/apache/sling/testing/mock/osgi/MockBundleContext.java#L131

the actual occurrences i've seen are:

  • 'stringInterpolationProvider' (org.apache.sling.resourceresolver.impl.mapping.StringInterpolationProvider) for class org.apache.sling.resourceresolver.impl.ResourceResolverFactoryActivator (for AEM Cloud and AEM 6.11)

  • 'resourceAccessSecurityTracker' (org.apache.sling.resourceresolver.impl.ResourceAccessSecurityTracker) for class org.apache.sling.resourceresolver.impl.ResourceResolverFactoryActivator (for AEM 6.5.0 where stringInterpolationProvider is not present)

Stefan Seifert November 29, 2021 at 3:28 PM

it seems that SLING-10944 was the root cause of this issue. so it should be fixed with the next release of AEM Mocks (4.1.6).

i also re-enabled the parallel execution of unit tests for the aem-mock project itself. it runs fine as well (and, despite previous observations, seems to increase the test run a bit, by ~20-25% on average).

Stefan Seifert November 25, 2021 at 7:36 PM

collection of services that were expected but missing for injection in the various failed test runs:

  • 'adapterManager' (org.apache.sling.api.adapter.AdapterManager) for class org.apache.sling.models.impl.ModelAdapterFactory

  • 'eventAdmin' (org.osgi.service.event.EventAdmin) for class io.wcm.testing.mock.aem.dam.MockAemDamAdapterFactory

  • 'resourceAccessSecurityTracker' (org.apache.sling.resourceresolver.impl.ResourceAccessSecurityTracker) for class org.apache.sling.resourceresolver.impl.ResourceResolverFactoryActivator

  • 'stringInterpolationProvider' (org.apache.sling.resourceresolver.impl.mapping.StringInterpolationProvider) for class org.apache.sling.resourceresolver.impl.ResourceResolverFactoryActivator

Fixed

Details

Created November 19, 2021 at 3:20 PM
Updated December 7, 2021 at 2:01 PM
Resolved December 7, 2021 at 2:01 PM