Filtering out invalid entity references in Drupal 8

Today I was working on a custom Drupal 8 form where I needed an option to purge existing entities and their references on a parent entity before running an import. It seemed pretty straightforward until I saw "ghost" values persisting on the parent entity's inline entity form. Here's my journey down the rabbit hole to fix broken entity reference values.

My first attempt was simple. The purge runs in a batch, so I grabbed a few entities at a time and just deleted them, hoping magic would happen.

$limit = 25;
$price_list_item_ids = array_slice($price_list->getItemsIds(), 0, $limit);
$price_list_items = $price_list_item_storage->loadMultiple($price_list_item_ids);
$price_list_item_storage->delete($price_list_items);

That didn't work. When I viewed the price list, the references still existed, albeit broken.

So then I tried the filterEmptyItems() method on the field, thinking "well, the value is empty." Fingers crossed, I did the following:

$price_list->get('items')->filterEmptyItems();

No dice. So I dug in to see why this wasn't working. The \Drupal\Core\Field\FieldItemList::filterEmptyItems method invokes \Drupal\Core\TypedData\Plugin\DataType\ItemList::filter with a callback that keeps only the field items that report a non-empty value.

  /**
   * {@inheritdoc}
   */
  public function filterEmptyItems() {
    $this->filter(function ($item) {
      return !$item->isEmpty();
    });
    return $this;
  }

Well, it should be empty, right? WRONG AGAIN. The EntityReferenceItem class doesn't care if the reference is valid or not, only if it has values in its properties. In my case, there was always a target_id set and sometimes $this->entity was populated from a previous access.

  /**
   * {@inheritdoc}
   */
  public function isEmpty() {
    // Avoid loading the entity by first checking the 'target_id'.
    if ($this->target_id !== NULL) {
      return FALSE;
    }
    if ($this->entity && $this->entity instanceof EntityInterface) {
      return FALSE;
    }
    return TRUE;
  }
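
To make the mismatch visible, a small debugging sketch like the one below can report what each item says about itself. The function name mymodule_describe_references() is hypothetical, and the sketch assumes the field it receives is an entity reference field.

```php
use Drupal\Core\Field\FieldItemListInterface;

/**
 * Hypothetical debugging helper, not part of core.
 *
 * Reports, per delta, what each entity reference item claims about itself.
 */
function mymodule_describe_references(FieldItemListInterface $items) {
  $report = [];
  foreach ($items as $delta => $item) {
    $report[$delta] = [
      'target_id' => $item->target_id,
      // FALSE whenever target_id is set, even if the target was deleted.
      'is_empty' => $item->isEmpty(),
      // The computed 'entity' property is NULL when the target cannot be
      // loaded. An item that was accessed earlier in the same request may
      // still hold the entity object it loaded back then.
      'has_entity' => $item->entity !== NULL,
    ];
  }
  return $report;
}
```

Calling it with $price_list->get('items') after the delete should show rows where is_empty is FALSE while has_entity is also FALSE, which is exactly the "ghost" state described above.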

So I took another approach. I checked whether validating the field would return constraint violations that I could use to remove the bad field values. Running validate gave me a list of all the broken references... except it was nearly impossible to traverse them and work out which field value delta was invalid.

$price_list->get('items')->validate();
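
For anyone who wants to look at what validate() hands back, here is a rough sketch that flattens the violation list into plain arrays. The helper name mymodule_list_violations() is hypothetical. Whether the property path identifies a usable delta is an assumption that depends on the core version and on where the constraint is attached, so treat the output as something to inspect rather than rely on.

```php
use Drupal\Core\Field\FieldItemListInterface;

/**
 * Hypothetical helper, not part of core.
 *
 * Flattens the violations returned by validate() into plain arrays that
 * are easier to dump or log.
 */
function mymodule_list_violations(FieldItemListInterface $items) {
  $rows = [];
  foreach ($items->validate() as $violation) {
    $rows[] = [
      // May or may not include the delta of the offending item.
      'path' => $violation->getPropertyPath(),
      'message' => (string) $violation->getMessage(),
      'invalid_value' => $violation->getInvalidValue(),
    ];
  }
  return $rows;
}
```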

So then I backtracked a bit. I noticed that the filter method for the item list was public. The filterEmptyItems method wasn't working for me, so why not roll my own? And I did. And it failed.

$items->filter(function (EntityReferenceItem $item) {
  return !$item->validate()->count();
});

Why? Because the ValidReference constraint is on the entity reference list class and not on the entity reference item class itself. There were never any violations at the field value level, only for the entire list of values.

I couldn't rely on validation, but I could rely on the computed entity property being NULL when the referenced entity no longer exists, since it is resolved through the entity adapter. And that's how I got to my final version, which removes all invalid entity reference values:

// The normal filter on empty items does not work, because entity
// reference only cares if the target_id is set, not that it is a viable
// reference. This is only checked on the constraints. But constraints
// do not provide enough data. So we use a custom filter.
$price_list->get('items')->filter(function (EntityReferenceItem $item) {
  return $item->entity !== NULL;
});
$price_list->save();
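
If the same cleanup is needed in more than one place, the filter can be wrapped in a small helper. This is only a sketch: the function name mymodule_remove_broken_references() is hypothetical, and it assumes the named field is an entity reference field on a fieldable entity.

```php
use Drupal\Core\Entity\FieldableEntityInterface;
use Drupal\Core\Field\FieldItemInterface;

/**
 * Hypothetical helper, not part of core.
 *
 * Removes entity reference values whose target can no longer be loaded
 * and returns how many values were removed.
 */
function mymodule_remove_broken_references(FieldableEntityInterface $entity, $field_name) {
  $items = $entity->get($field_name);
  $before = $items->count();

  // Keep only the items whose computed 'entity' property still resolves.
  $items->filter(function (FieldItemInterface $item) {
    return $item->entity !== NULL;
  });

  $removed = $before - $items->count();
  // Only save when something actually changed.
  if ($removed > 0) {
    $entity->save();
  }
  return $removed;
}
```

In the batch step it could be called as mymodule_remove_broken_references($price_list, 'items') right after deleting the slice of price list items. Saving only when something was removed avoids a pointless write on passes where every reference is still valid.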

Hopefully this helps someone else heading down the same rabbit hole of programmatically removing entities and mending the references left behind.