HugeGraph-Computer Quick Start

1 HugeGraph-Computer Overview

HugeGraph-Computer is a distributed graph processing system for HugeGraph (OLAP). It is an implementation of Pregel and runs on the Kubernetes (K8s) framework. It focuses on supporting graph data volumes of hundreds of billions to trillions and uses disk for sorting and acceleration, which is one of the biggest differences from Vermeer.

Features

  • Supports distributed MPP graph computing and integrates with HugeGraph as graph input/output storage.
  • Based on the BSP (Bulk Synchronous Parallel) model, an algorithm performs computing through multiple parallel iterations; every iteration is a superstep.
  • Automatic memory management. The framework will never hit OOM (Out of Memory), since it spills some data to disk if it doesn't have enough memory to hold all the data.
  • Part of the edges or messages of a super node can be kept in memory, so you will never lose them.
  • You can load the data from HDFS or HugeGraph, or any other system.
  • You can output the results to HDFS or HugeGraph, or any other system.
  • Easy to develop a new algorithm. You only need to focus on vertex-only processing, just as on a single server, without worrying about message transfer or memory/storage management.

2 Dependency for Building/Running

2.1 Install Java 11 (JDK 11)

Java 11 or later is required to run Computer; you need to install and configure it yourself.

Be sure to execute the java -version command to check the JDK version before reading on.
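The version check can also be scripted. The following is a minimal sketch, not part of HugeGraph-Computer: the helper names java_major_version and check_java are hypothetical, and it assumes that the first line of the java -version output contains a quoted version string such as "11.0.20" or "1.8.0_292".

```shell
# Hypothetical helpers (not part of HugeGraph-Computer) for checking the JDK version.

# Print the major version parsed from the first line of `java -version` output.
# Assumes a quoted version string: "1.8.0_292" (legacy scheme) or "11.0.20" (modern scheme).
java_major_version() {
  ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  case "$ver" in
    1.*) printf '%s\n' "$ver" | cut -d. -f2 ;;
    *)   printf '%s\n' "$ver" | cut -d. -f1 | cut -d- -f1 ;;
  esac
}

# Return 0 if the `java` found on PATH is version 11 or later, non-zero otherwise.
check_java() {
  line=$(java -version 2>&1 | head -n 1)
  major=$(java_major_version "$line")
  if [ "${major:-0}" -ge 11 ] 2>/dev/null; then
    echo "OK: Java $major"
  else
    echo "Java 11 or later is required, found: $line" >&2
    return 1
  fi
}

# Usage: check_java
```

Keeping the parsing in its own function makes it possible to try it against sample version strings without having a JDK installed.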

3 Get Started

3.1 Run PageRank algorithm locally

To run the algorithm with HugeGraph-Computer, you need to install Java 11 or a later version.

You also need to deploy HugeGraph-Server and Etcd.

There are two ways to get HugeGraph-Computer:

  • Download the compiled tarball
  • Clone the source code, then compile and package it

3.1.1 Download the compiled archive

Download the latest version of the HugeGraph-Computer release package:

wget https://downloads.apache.org/incubator/hugegraph/${version}/apache-hugegraph-computer-incubating-${version}.tar.gz
tar zxvf apache-hugegraph-computer-incubating-${version}.tar.gz -C hugegraph-computer
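The download commands expect the shell variable version to be set, and tar -C needs the target directory to exist already. A minimal sketch that takes care of both is shown here; the function names release_url and fetch_release are hypothetical, and 1.0.0 is only a placeholder version, so check the download page for the releases that are actually available.

```shell
# Hypothetical helpers; "1.0.0" in the usage line is only a placeholder version.

# Print the release tarball URL for a given version, following the URL pattern above.
release_url() {
  printf 'https://downloads.apache.org/incubator/hugegraph/%s/apache-hugegraph-computer-incubating-%s.tar.gz\n' "$1" "$1"
}

# Download a release and unpack it into ./hugegraph-computer.
fetch_release() {
  version=$1
  archive="apache-hugegraph-computer-incubating-${version}.tar.gz"
  wget "$(release_url "$version")" || return 1
  # tar -C requires an existing target directory, so create it first.
  mkdir -p hugegraph-computer
  tar zxvf "$archive" -C hugegraph-computer
}

# Usage: fetch_release 1.0.0
```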

3.1.2 Clone source code to compile and package

Clone the latest version of the HugeGraph-Computer source code:

git clone https://github.com/apache/hugegraph-computer.git

Compile and generate the tar package:

cd hugegraph-computer
mvn clean package -DskipTests

3.1.3 Start master node

You can use the -c parameter to specify the configuration file. For more computer configuration options, see: Computer Config Options

cd hugegraph-computer
bin/start-computer.sh -d local -r master

3.1.4 Start worker node

bin/start-computer.sh -d local -r worker

3.1.5 Query algorithm results

3.1.5.1 Enable OLAP index query for server

If the OLAP index is not enabled, enable it first. For details, see: modify-graphs-read-mode

PUT http://localhost:8080/graphs/hugegraph/graph_read_mode
"ALL"
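
The request above can be sent from the command line with curl. This is a sketch under assumptions: the function names read_mode_url and enable_olap_read_mode are hypothetical, and it assumes the server accepts a JSON body and does not require authentication.

```shell
# Hypothetical helpers for sending the PUT request shown above.

# Print the graph_read_mode endpoint for a server base URL and a graph name.
read_mode_url() {
  printf '%s/graphs/%s/graph_read_mode\n' "$1" "$2"
}

# Send the PUT request with the JSON string "ALL" as the body.
# Defaults match the example: server http://localhost:8080, graph hugegraph.
enable_olap_read_mode() {
  curl -X PUT -H 'Content-Type: application/json' -d '"ALL"' \
    "$(read_mode_url "${1:-http://localhost:8080}" "${2:-hugegraph}")"
}

# Usage: enable_olap_read_mode http://localhost:8080 hugegraph
```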

3.1.5.2 Query page_rank property value:

curl "http://localhost:8080/graphs/hugegraph/graph/vertices?page&limit=3" | gunzip

3.2 Run PageRank algorithm in Kubernetes

To run an algorithm with HugeGraph-Computer, you need to deploy HugeGraph-Server first.

3.2.1 Install HugeGraph-Computer CRD

# Kubernetes version >= v1.16
kubectl apply -f https://raw.githubusercontent.com/apache/hugegraph-computer/master/computer-k8s-operator/manifest/hugegraph-computer-crd.v1.yaml
# Kubernetes version < v1.16
kubectl apply -f https://raw.githubusercontent.com/apache/hugegraph-computer/master/computer-k8s-operator/manifest/hugegraph-computer-crd.v1beta1.yaml

3.2.2 Show CRD

kubectl get crd
NAME                                        CREATED AT
hugegraphcomputerjobs.hugegraph.apache.org   2021-09-16T08:01:08Z

3.2.3 Install hugegraph-computer-operator&etcd-server

kubectl apply -f https://raw.githubusercontent.com/apache/hugegraph-computer/master/computer-k8s-operator/manifest/hugegraph-computer-operator.yaml

3.2.4 Wait for hugegraph-computer-operator&etcd-server deployment to complete

kubectl get pod -n hugegraph-computer-operator-system
NAME                                                              READY   STATUS    RESTARTS   AGE
hugegraph-computer-operator-controller-manager-58c5545949-jqvzl   1/1     Running   0          15h
hugegraph-computer-operator-etcd-28lm67jxc5                       1/1     Running   0          15h
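
Waiting for the deployment can be automated by polling the pod table until every pod reports Running. A minimal sketch follows; the function names all_pods_running and wait_for_operator are hypothetical, the parsing assumes the default table layout shown above (STATUS in the third column), and the loop has no timeout.

```shell
# Hypothetical helper: succeed only if every pod row in a `kubectl get pod`
# table (read from stdin) has STATUS "Running". An empty table counts as not ready.
all_pods_running() {
  awk 'NR > 1 && $3 != "Running" { bad = 1 } END { exit (bad || NR < 2) }'
}

# Poll every 5 seconds until all pods in the operator namespace are running.
# Note: this loops forever if the pods never become ready.
wait_for_operator() {
  ns=${1:-hugegraph-computer-operator-system}
  until kubectl get pod -n "$ns" | all_pods_running; do
    sleep 5
  done
}

# Usage: wait_for_operator
```

Depending on your kubectl version, kubectl wait --for=condition=Ready pod --all -n hugegraph-computer-operator-system --timeout=300s may be a simpler alternative.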

3.2.5 Submit a job

For more on the computer CRD, see: Computer CRD

For more computer configuration options, see: Computer Config Options

cat <<EOF | kubectl apply --filename -
apiVersion: hugegraph.apache.org/v1
kind: HugeGraphComputerJob
metadata:
  namespace: hugegraph-computer-operator-system
  name: &jobName pagerank-sample
spec:
  jobId: *jobName
  algorithmName: page_rank
  image: hugegraph/hugegraph-computer:latest # algorithm image url
  jarFile: /hugegraph/hugegraph-computer/algorithm/builtin-algorithm.jar # algorithm jar path
  pullPolicy: Always
  workerCpu: "4"
  workerMemory: "4Gi"
  workerInstances: 5
  computerConf:
    job.partitions_count: "20"
    algorithm.params_class: org.apache.hugegraph.computer.algorithm.centrality.pagerank.PageRankParams
    hugegraph.url: http://${hugegraph-server-host}:${hugegraph-server-port} # hugegraph server url
    hugegraph.name: hugegraph # hugegraph graph name
EOF

3.2.6 Show job

kubectl get hcjob/pagerank-sample -n hugegraph-computer-operator-system
NAME               JOBID              JOBSTATUS
pagerank-sample    pagerank-sample    RUNNING
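
To follow a job until it finishes, the status column of this table can be polled. The sketch below is only illustrative: the function names job_status_from_table and wait_for_job are hypothetical, and the terminal status names SUCCEEDED, FAILED and CANCELLED are assumptions (only RUNNING appears in the output above), so pass your own list if the operator reports different values.

```shell
# Hypothetical helper: print the JOBSTATUS column of the first job row
# in a `kubectl get hcjob/...` table read from stdin.
job_status_from_table() {
  awk 'NR == 2 { print $3 }'
}

# Poll every 10 seconds and print the status until it matches one of the
# terminal states. The default state names are assumptions, not confirmed values.
wait_for_job() {
  job=$1
  ns=${2:-hugegraph-computer-operator-system}
  terminal=${3:-"SUCCEEDED FAILED CANCELLED"}
  while :; do
    status=$(kubectl get "hcjob/$job" -n "$ns" | job_status_from_table)
    echo "$job: ${status:-unknown}"
    case " $terminal " in
      *" $status "*) [ -n "$status" ] && return 0 ;;
    esac
    sleep 10
  done
}

# Usage: wait_for_job pagerank-sample
```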

3.2.7 Show log of nodes

# Show the master log
kubectl logs -l component=pagerank-sample-master -n hugegraph-computer-operator-system
# Show the worker log
kubectl logs -l component=pagerank-sample-worker -n hugegraph-computer-operator-system
# Show diagnostic log of a job
# NOTE: the diagnostic log exists only when the job fails, and it is kept for only one hour.
kubectl get event --field-selector reason=ComputerJobFailed,involvedObject.name=pagerank-sample -n hugegraph-computer-operator-system

3.2.8 Show success event of a job

NOTE: the event is kept for only one hour.

kubectl get event --field-selector reason=ComputerJobSucceed,involvedObject.name=pagerank-sample -n hugegraph-computer-operator-system

3.2.9 Query algorithm results

If the results are output to HugeGraph-Server, query them in the same way as in the local run (see 3.1.5). If the results are output to HDFS, check the result files under the /hugegraph-computer/results/{jobId} directory.
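
For HDFS output, the result directory can be listed with the hdfs client. A minimal sketch, assuming the hdfs command is on PATH and configured for the cluster that holds the results; the function names result_dir and list_results are hypothetical, and the layout of the files inside the directory is not specified here.

```shell
# Hypothetical helpers for inspecting job results stored in HDFS.

# Print the result directory for a job ID, following the path pattern above.
result_dir() {
  printf '/hugegraph-computer/results/%s\n' "$1"
}

# List the result files of a job; requires a configured `hdfs` client.
list_results() {
  hdfs dfs -ls "$(result_dir "$1")"
}

# Usage: list_results pagerank-sample
```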

4 Built-In algorithms document

4.1 Supported algorithms list:

Centrality Algorithm:
  • PageRank
  • BetweennessCentrality
  • ClosenessCentrality
  • DegreeCentrality
Community Algorithm:
  • ClusteringCoefficient
  • Kcore
  • Lpa
  • TriangleCount
  • Wcc
Path Algorithm:
  • RingsDetection
  • RingsDetectionWithFilter

For more algorithms, see: Built-In algorithms

4.2 Algorithm description

TODO

5 Algorithm development guide

TODO

6 Note

  • If some classes under computer-k8s cannot be found, you need to execute mvn compile in advance to generate the corresponding classes.