0
votes

I'm battling to configure Apache Ignite to distribute partitions in zone-aware manner. I have Ignite 2.8.0 with 4 nodes running as StatefulSet pods in GKE 1.14 split in two zones. I followed the guide, and the example:

  • Propagated zone names into pod under AVAILABILITY_ZONE env var.
  • Then using Web-Console I verified that this env var was loaded correctly for each node.
  • I setup cache template in node XML config as in the below and created a cache from it using GET /ignite?cmd=getorcreate&cacheName=zone-aware-cache&templateName=zone-aware-cache (I can't see affinityBackupFilter settings in UI, but other parameters from the template got applied, so I assume it worked)

To simplify verification of partition distribution, I the partition number is set to just 2. After creating the cache I observed the following partition distribution:

enter image description here

Then I mapped nodes ids to values in AVAILABILITY_ZONE env var, as reported by nodes, with the following results:

AA146954 us-central1-a
3943ECC8 us-central1-c
F7B7AB67 us-central1-a
A94EE82C us-central1-c

As one can easily see, partition 0 pri/bak resides on nodes 3943ECC8 and A94EE82C which both are in the same zone. What am I missing to make it work?

Another odd thing, is then specifying partition number to be low (e.g. 2 or 4), only 3 out of 4 nodes are used). When using 1024 partitions, all nodes are utilized, but the problem still exists - 346 out of 1024 partitions had their primary/backup colocated in the same zone.

Here is my node config XML:

<beans xmlns="http://www.springframework.org/schema/beans"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="
  http://www.springframework.org/schema/beans
  http://www.springframework.org/schema/beans/spring-beans.xsd">

  <bean class="org.apache.ignite.configuration.IgniteConfiguration">

    <!-- Enabling Apache Ignite Persistent Store. -->
    <property name="dataStorageConfiguration">
      <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
        <property name="defaultDataRegionConfiguration">
          <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
            <property name="persistenceEnabled" value="true"/>
          </bean>
        </property>
      </bean>
    </property>

    <!-- Explicitly configure TCP discovery SPI to provide list of initial nodes. -->
    <property name="discoverySpi">
      <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
        <property name="ipFinder">
          <!-- Enables Kubernetes IP finder and setting custom namespace and service names.  -->
          <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.kubernetes.TcpDiscoveryKubernetesIpFinder">
            <property name="namespace" value="ignite"/>
          </bean>
        </property>
      </bean>
    </property>

    <property name="cacheConfiguration">
      <list>
        <bean id="zone-aware-cache-template" abstract="true" class="org.apache.ignite.configuration.CacheConfiguration">
          <!-- when you create a template via XML configuration, you must add an asterisk to the name of the template -->
          <property name="name" value="zone-aware-cache*"/>
          <property name="cacheMode" value="PARTITIONED"/>
          <property name="atomicityMode" value="ATOMIC"/>
          <property name="backups" value="1"/>
          <property name="readFromBackup" value="true"/>
          <property name="partitionLossPolicy" value="READ_WRITE_SAFE"/>
          <property name="copyOnRead" value="true"/>
          <property name="eagerTtl" value="true"/>
          <property name="statisticsEnabled" value="true"/>
          <property name="affinity">
            <bean class="org.apache.ignite.cache.affinity.rendezvous.RendezvousAffinityFunction">
              <property name="partitions" value="2"/>  <!-- for debugging only! -->
              <property name="excludeNeighbors" value="true"/>
              <property name="affinityBackupFilter">
                <bean class="org.apache.ignite.cache.affinity.rendezvous.ClusterNodeAttributeAffinityBackupFilter">
                  <constructor-arg>
                    <array value-type="java.lang.String">
                      <!-- Backups must go to different AZs -->
                      <value>AVAILABILITY_ZONE</value>
                    </array>
                  </constructor-arg>
                </bean>
              </property>
            </bean>
          </property>
        </bean>
      </list>
    </property>

  </bean>
</beans>

Update: Eventually excludeNeighbors false/true makes or breaks zone awareness. I'm not sure why it didn't work with excludeNeighbors=false previously for me. I made some scripts to automate my testing. And now it's definite that it's the excludeNeighbors setting. It's all here: https://github.com/doitintl/ignite-gke. Regardless I also opened a bug with IGNITE Jira: https://issues.apache.org/jira/browse/IGNITE-12896. Many thanks to @alamar for his suggestions.

1

1 Answers

2
votes

I recommend setting excludeNeighbors to false. It is true in your case, it is not needed, and I get correct partitions mapping when I set it to false (of course, I also run all four nodes locally).

Environment property was enough, did not need to add it manually to user attributes.