永続的かつ不変の Java LinkedList-Java チュートリアル-php.cn

Persistent and Immutable Java LinkedList

この記事では、Java で LinkedList の永続的かつ不変のバリエーションを実装します
部分的な構造共有により、時間と空間の効率が向上します

導入

リンクリストとは何ですか

リンクリストは、各ノードに値とシーケンス内の次のノードへの参照が含まれるノードのコレクションで構成されるデータ構造です。リストの先頭に要素を追加したり、先頭から要素を削除したりするような操作は O(1) 操作です。ただし、リストの末尾に要素を追加したり、末尾から要素を削除したりするような操作は O(n) 操作です。ここで、n はリスト内の要素の数です。

不変の LinkedList が必要な理由

関数型プログラミングでは、不変性が重要な概念です。不変性とは、データ構造が一度作成されると、変更することはできません。代わりに、変更を加えて新しいデータ構造が作成され、元のデータ構造は変更されません。

このプロパティにより、いくつかの利点が得られます:

スレッドセーフ: データ構造は不変であるため、同期を必要とせずに複数のスレッド間で共有できます。
: データ構造は不変であるため、いつでもデータ構造の状態を推論できます。
: データ構造は不変であるため、以前のバージョンのデータ構造を使用していつでも前の状態に戻すことができます。
: データ構造は変更できないため、不変性によりデバッグが容易になります。

コレクションが設計されたときの思い付きではないことから、本質的なパフォーマンス上の理由まで、

不変のデータ構造

変更不可能な JDK コレクションとこの LinkedList の違い

JDK は、元のコレクションのラッパーである変更不可能なコレクションを提供します。これらは不変性をサポートしますが、永続的ではなく、真の型で安全な方法を提供しません。これらはオリジナルのコレクションのラッパーであり、

変更操作が呼び出されたときに例外をスローします。これは、実行時に UnsupportedOperationException を取得することが困難であることを保証しながら、データ構造をまったく変更できない不変性と同じではありません。

永続的 vs 不変的

永続的と不変という用語はしばしば同じ意味で使用されますが、意味は異なります。不変性ではデータ構造の変更は許可されませんが、永続性ではデータ構造が変更された場合にその構造を共有できます。これは、データ構造が変更されるとき、つまり新しいバージョンが作成されるときに、古いデータ構造の一部を新しいデータ構造と共有でき、時間とスペースの効率が向上することを意味します。この手法は、

構造共有

と呼ばれます。データ構造の永続性を実現するには複数の方法があります。データ構造は、AVL や赤黒ツリーなどのバランスツリーを使用する単純なものから複雑なもの、フィンガーツリーや基数ベースのバランスツリーなどのより複雑なものまで多岐にわたります。

この記事では、部分的な構造共有を備えた永続的かつ不変の LinkedList のより単純なバージョンを実装します。その理由は、LinkedList は単純なデータ構造であり、不変性と永続性の概念をより深く理解するのに役立ち、通常、より複雑なデータ構造の実装は本質的に困難な作業であるためです。

実装

以下では、段階的なアプローチを使用して、Java で永続的かつ不変の単独の LinkedList を実装します。

Java への関数型プログラミングのツアーを支援する完全な実装と追加のモナドとユーティリティについては、この素晴らしい小さなライブラリ FunctionalUtils をチェックしてください。

LinkedList に付ける名前は SeqList となり、汎用クラスになります。

最初に、リストでサポートする操作について考える必要があります。

O(1) 操作となるリストの先頭に追加します。

リストからの要素の削除。要素が末尾の方にある場合、最悪でも O(n) 操作になります。

リスト内の任意の位置に追加します。
述語を指定して要素をフィルターイン/フィルターアウトするフィルター操作。
関数の合成を容易にするためにリストをモナドに変換するマップ操作とフラットマップ操作

head holding a value.
tail holding the rest of the list which in turn is a LinkedList consisting of head and tail until the end of the list.
The end of the list is represented by an empty LinkedList which means that both head and tail are null.

The full implementation can be found in here

Given that the last element of the list is an empty LinkedList and each element is a node with a head and a tail, we can represent our LinkedList as a recursive data structure consisting of two classes:

public record Empty() implements SeqList {
}

public record Cons(T head, SeqList tail) implements SeqList {
}

ログイン後にコピー

where Cons is a functional programming term named Construct dates back to the Lisp programming language.

Given the above, we can implement the SeqList interface as follows:

public sealed interface SeqList permits Empty, Cons {
  /**
   * Creates an empty list.
   *
   * @param  the type of the elements
   * @return an empty list
   */
  static  SeqList empty() {
    return new Empty<>();
  }

  /**
   * Creates a new list with the given elements. The complexity of this method is O(n) where n is
   * the number of elements.
   *
   * @param elements the elements to add
   * @param       the type of the elements
   * @return a new list with the elements added
   */
  @SafeVarargs
  static  SeqList of(T... elements) {
    SeqList list = empty();
    for (int i = elements.length - 1; i >= 0; i--) {
      list = list.add(elements[i]);
    }
    return list;
  }

  /**
   * Prepends the element to the list. The complexity of this method is O(1).
   *
   * @param element the element to add
   * @return a new list with the element prepended
   */
  default SeqList add(T element) {
    return new Cons<>(element, this);
  }
}

ログイン後にコピー

Let's break down what we wrote above:

We created a sealed interface SeqList which is going to be the interface for our LinkedList.
The method empty() creates an empty list which is an instance of the class Empty.
The method add() prepends an element to the list. The complexity of this method is O(1) as we are just creating a new node with the given element and the current list. This method uses Structural Sharing as the new list shares the tail of the current list.
The method of() creates a new list with the given elements. The complexity of this method is O(n) where n is the number of elements. As it is evident we start from the last element and add it to the list. This is because we want to preserve the order of the elements.

We need to implement the remaining of our operations. Let's start with the remove operation:

/**
   * Removes the first occurrence of the element from the list. If the element is not found, the
   * list is returned as is. The complexity of this method is O(n) where n is the number of
   * elements. It uses structural sharing up to the element to remove. If the element is not found
   * the structural sharing is not utilized.
   *
   * @param element the element to remove
   * @return a new list with the element removed
   * @throws StackOverflowError for infinite lists
   */
  default SeqList remove(T element) {
    if (isEmpty()) {
      return this;
    }
    if (head().equals(element)) {
      return tail();
    }
    return new Cons<>(head(), tail().remove(element));
  }

ログイン後にコピー

And additionally implement the tail() method and some other useful ones in our subclasses:

public record Cons(T head, SeqList tail) implements SeqList {

  @Override
  public boolean isEmpty() {
    return false;
  }

  @Override
  public Optional headOption() {
    return Optional.ofNullable(head);
  }

  @Override
  public Optional last() {
    if (tail.isEmpty()) {
      return Optional.ofNullable(head);
    }
    return tail.last();
  }
}

public record Empty() implements SeqList {

  @Override
  public boolean isEmpty() {
    return true;
  }

  @Override
  public T head() {
    throw new UnsupportedOperationException("head() called on empty list");
  }

  @Override
  public Optional headOption() {
    return Optional.empty();
  }

  @Override
  public SeqList tail() {
    throw new UnsupportedOperationException("tail() called on empty list");
  }

  @Override
  public Optional last() {
    return Optional.empty();
  }
}

ログイン後にコピー

As we can examine from the implementation of the remove method we are using recursive calls to remove the element from
the list. This is a typical pattern in functional programming where we are using recursion to traverse the list and
remove the element. Care should be taken to avoid stack overflows in case of infinite lists. A future improvement could be to use tail recursion optimization which is not supported in Java but could be achieved using trampolining.

Lastly, let's implement the map and flatMap operations to turn our List into a Monad:

/**
   * Applies a map function to the elements of the list. The complexity of this method is O(n) where
   * n is the number of elements.
   * It does not use structural sharing as it requires advanced data structures to achieve
   * it.
   *
   * @param fn  the map function
   * @param  the type of the elements of the new list
   * @return a new list with the elements mapped
   * @throws StackOverflowError for infinite lists
   */
  default  SeqList map(Function fn) {
    if (isEmpty()) {
      return empty();
    }
    return new Cons<>(fn.apply(head()), tail().map(fn));
  }

  /**
   * Applies a flat map function to the elements of the list. The complexity of this method is O(n)
   * where n is the number of elements.
   * It does not use structural sharing as it requires advanced data structures to achieve
   * it.
   *
   * @param fn  the flat map function
   * @param  the type of the elements of the new list
   * @return a new list with the elements flat mapped
   * @throws StackOverflowError for infinite lists
   */
  default  SeqList flatMap(Function> fn) {
    if (isEmpty()) {
      return empty();
    }
    SeqList mappedHead = fn.apply(head());
    SeqList newTail = tail().flatMap(fn);
    return concat(mappedHead, newTail);
  }

  /**
   * Concatenates two lists. The complexity of this method is O(n) where n is the number of
   * elements.
   *
   * @param list1 the first list
   * @param list2 the second list
   * @param    the type of the elements
   * @return a new list with the elements of the two lists concatenated
   * @throws StackOverflowError for infinite lists
   */
  static  SeqList concat(SeqList list1, SeqList list2) {
    if (list1.isEmpty()) {
      return list2;
    }
    return new Cons<>(list1.head(), concat(list1.tail(), list2));
  }

ログイン後にコピー

As we can see from the implementation of the map and flatMap methods we are using recursive calls to traverse the list and apply the function to each element. The flatMap method is a bit more complex as it requires the function to return a new list which we need to concatenate with the rest of the list. Both methods are not using structural sharing due to its notoriety difficulty and the importance of using advanced data structures. A future improvement will be examined in a future article.

Usage examples

Let's see some usage examples of our SeqList.

Imagine we have a list of integers and we want to filter out the even numbers and then multiply them in the power of two but with immutability and persistence.

SeqList list = SeqList.of(1, 2, 3, 4, 5, 6);
SeqList updatedList = list
   .filterOut(number -> number % 2 == 0)
   .map(number -> Math.pow(number, 2));

ログイン後にコピー

Imagine we have a list of strings and we want to concatenate them with a prefix and a suffix.

SeqList list = SeqList.of("a", "b", "c", "d", "e");
SeqList updatedList = list
   .map(letter -> "prefix" + letter + "suffix");

ログイン後にコピー

Imagine we have a list of lists and we want to flatten them.

SeqList> list = SeqList.of(SeqList.of(1, 2), SeqList.of(3, 4), SeqList.of(5, 6));
SeqList updatedList = list
   .flatMap(seqList -> seqList);

ログイン後にコピー

Another example is pattern matching using JDK 21 switch expressions and taking advantage of the compiler checks.

SeqList list = SeqList.of(1, 2, 3, 4, 5, 6);
switch (list) {
  case Empty() -> {
    // do something
  }
  case Cons cons -> {
    //do something else
  }
}

ログイン後にコピー

Disadvantages

Performance: If the list is used primarily for prepending elements of getting elements from the head of the list then the performance is good. In all other cases, O(n) is required at least with this implementation.
Complexity: The implementation of a persistent and immutable LinkedList is more complex than its mutable counterpart.
Memory: The implementation of a persistent and immutable LinkedList requires more memory than its mutable counterpart due to the creation of new lists for each operation. With structural sharing this is alleviated but not eliminated.

Conclusion

In this article, we implemented a persistent and immutable LinkedList in Java with partial structural sharing. We demonstrated the benefits of immutability and persistence and how we can achieve them in Java. We also showed how we can turn our LinkedList into a Monad for easier function composition. We discussed the advantages and disadvantages of persistent and immutable data structures and how they can be used in practice.

以上が永続的かつ不変の Java LinkedListの詳細内容です。詳細については、PHP 中国語 Web サイトの他の関連記事を参照してください。