• Chapter 3 Client API: Basic Features - Scan

    2018. 8. 18. 21:06

    by. 위지원

    2018/08/10 - [Second half of 2018/HBASE] - Chapter 3 Client API: Basic Features - Put method

    2018/08/17 - [Second half of 2018/HBASE] - Chapter 3 Client API: Basic Features - Get method

    2018/08/17 - [Second half of 2018/HBASE] - Chapter 3 Client API: Basic Features - Delete method

    2018/08/17 - [Second half of 2018/HBASE] - Chapter 3 Client API: Basic Features - Batch operations


    Scan



    Scans take advantage of the sequential, sorted storage layout that HBase provides.




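    The Table class provides three getScanner methods: getScanner(byte[] family), getScanner(byte[] family, byte[] qualifier) and getScanner(Scan scan). The first two create the Scan instance for you, whereas the last one requires you to pass a Scan instance as a parameter.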




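    Scan has several constructors.



    According to the book, what sets Scan apart from the Get class is that you can supply a startRow. In the current API, however, the constructors that take a startRow are marked Deprecated. Instead, you can use the withStartRow, withStopRow and setFilter methods.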


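    The row given as startRow is included in the scan, while the row given as stopRow is not. The scan begins at the first row whose key is equal to or greater than startRow; if startRow is not set, it begins at the first row of the table. Likewise, the scan ends as soon as it meets a key equal to or greater than stopRow; if stopRow is not set, it runs to the last row of the table.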
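    The inclusive start / exclusive stop rule can be sketched in plain Java. This is only a simulation under stated assumptions: ScanRangeSketch is a hypothetical stand-in, not the HBase API, and it uses a TreeMap of ASCII String keys to imitate the sorted row keys.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Stand-in for a table: row keys are kept in sorted order.
// This is NOT the HBase API; it only mimics the start/stop row rules.
final class ScanRangeSketch {
    private final TreeMap<String, String> rows = new TreeMap<>();

    void put(String rowKey, String value) {
        rows.put(rowKey, value);
    }

    // startRow is inclusive, stopRow is exclusive; null means "not set".
    List<String> scan(String startRow, String stopRow) {
        List<String> visited = new ArrayList<>();
        // Begin at the first key >= startRow, or at the first row of the table.
        NavigableMap<String, String> candidates =
                (startRow == null) ? rows : rows.tailMap(startRow, true);
        for (String key : candidates.keySet()) {
            // End at the first key >= stopRow, without returning it.
            if (stopRow != null && key.compareTo(stopRow) >= 0) {
                break;
            }
            visited.add(key);
        }
        return visited;
    }

    public static void main(String[] args) {
        ScanRangeSketch table = new ScanRangeSketch();
        for (String key : new String[] {"row1", "row2", "row3"}) {
            table.put(key, "val");
        }
        System.out.println(table.scan("row1", "row3")); // [row1, row2]
        System.out.println(table.scan("row15", null));  // [row2, row3]
        System.out.println(table.scan(null, "row2"));   // [row1]
    }
}
```

    The second call shows that startRow does not have to exist: "row15" is not in the table, so the scan begins at the next larger key, "row2".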



    The filter parameter refers to the Filter class.


    As with Get, you can narrow a Scan further with addColumn or addFamily.



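    You can also restrict the scan by time with setTimeRange and setTimeStamp.




    As for the method the book uses to set the maximum number of versions, it is deprecated, as we saw once before, and the API documentation says to use readAllVersions() instead.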





    A Filter can also be set later, and hasFilter checks whether a filter has been set.






    As mentioned at the beginning, the setStartRow method the book introduces should be replaced with withStartRow(). The same goes for setStopRow, which becomes withStopRow().




    ResultScanner



    Let's look at the ResultScanner that getScanner returns.


    Scan results come back row by row, and rows can be very large. Shipping every matching row in a single request would take too many resources and too much time, so ResultScanner wraps the Result instance of each row and lets you iterate over them.


    ResultScanner has only a handful of methods. Of these, the book covers close() and next().



    When you are done, you must call close() to release the resources.


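    **Scanner leases


    Release scanner instances as early as possible.

    A scanner holds a fair amount of server-side resources, so scanners that pile up without being closed waste a lot of space.


    Write your code with try/finally so that close() is called even when an exception is thrown!


    The book says its upcoming examples skip this advice to stay concise, haha.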
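    As a minimal sketch of that advice, the snippet below uses try-with-resources, which is the compact form of try/finally. FakeScanner is a hypothetical stand-in so the example runs without a cluster; the real ResultScanner is also Closeable, so the same pattern should carry over to it.

```java
import java.io.Closeable;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a scanner: iterable, and it must be closed.
final class CloseSketch {
    static final class FakeScanner implements Closeable, Iterable<String> {
        private final List<String> rows;
        boolean closed = false;

        FakeScanner(List<String> rows) {
            this.rows = rows;
        }

        @Override
        public Iterator<String> iterator() {
            return rows.iterator();
        }

        @Override
        public void close() {
            closed = true; // a real scanner would release server-side resources here
        }
    }

    // Processes rows and fails on purpose at "row2" to show that close() still runs.
    static boolean closedAfterFailure(FakeScanner scanner) {
        try (FakeScanner s = scanner) {
            for (String row : s) {
                if (row.equals("row2")) {
                    throw new IllegalStateException("processing failed at " + row);
                }
            }
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
        return scanner.closed;
    }

    public static void main(String[] args) {
        FakeScanner scanner = new FakeScanner(List.of("row1", "row2", "row3"));
        System.out.println("closed = " + closedAfterFailure(scanner)); // closed = true
    }
}
```

    The resource is closed before the catch block runs, so the scanner is released even though processing stopped halfway.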


    To have such scanners released after a certain time, add the following property to the configuration (hbase-site.xml).


    <property>

    <name>hbase.regionserver.lease.period</name>

    <value>120000</value>

    </property>


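    According to the book, the unit is milliseconds. Note that the lease example later in this post reads the period through HConstants.HBASE_CLIENT_SCANNER_TIMEOUT_PERIOD rather than through this property name.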


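    next() comes in two forms, depending on how you want to process the results. The no-argument next() returns a single Result instance for the next row, or null once the scanner is exhausted. The other form, next(int nbRows), takes nbRows as a parameter, which sets how many rows to fetch in one call.


    It returns a Result[], so each element of the array is a Result instance, and each one represents a distinct row.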
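    The contract of the two forms can be mirrored with a small stand-in. NextSketch is hypothetical and uses plain strings instead of Result objects; it only illustrates what each call hands back, not how HBase fetches the rows.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in that mirrors the two next() forms described above.
final class NextSketch {
    private final Iterator<String> rows;

    NextSketch(List<String> rows) {
        this.rows = rows.iterator();
    }

    // One row per call; null once the scan is exhausted.
    String next() {
        return rows.hasNext() ? rows.next() : null;
    }

    // Up to nbRows rows per call; an empty array once the scan is exhausted.
    String[] next(int nbRows) {
        List<String> chunk = new ArrayList<>();
        while (chunk.size() < nbRows && rows.hasNext()) {
            chunk.add(rows.next());
        }
        return chunk.toArray(new String[0]);
    }

    public static void main(String[] args) {
        NextSketch scanner = new NextSketch(List.of("row1", "row2", "row3", "row4"));
        System.out.println(scanner.next());         // row1
        System.out.println(scanner.next(2).length); // 2
        System.out.println(scanner.next(2).length); // 1
        System.out.println(scanner.next());         // null
    }
}
```

    Note that next(2) may return fewer than two rows when the scan is about to end, so callers should check the array length rather than assume it.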


    Haha, now let's write some code.


    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ScanExample {
        public static void main(String[] args) throws IOException {
            // HBaseConfiguration.create() replaces the deprecated HBaseConfiguration constructor.
            Configuration conf = HBaseConfiguration.create();
            Connection connection = ConnectionFactory.createConnection(conf);
            Admin admin = connection.getAdmin();

            // Create "testtable" with two column families that keep up to 10 versions each.
            HTableDescriptor tableDescriptor = new HTableDescriptor(TableName.valueOf("testtable"));
            HColumnDescriptor cd = new HColumnDescriptor("colfam1");
            HColumnDescriptor cd2 = new HColumnDescriptor("colfam2");
            cd.setMaxVersions(10);
            cd2.setMaxVersions(10);
            tableDescriptor.addFamily(cd);
            tableDescriptor.addFamily(cd2);
            admin.createTable(tableDescriptor);
            System.out.println("create table testTable.." + tableDescriptor.getNameAsString());

            Table table = connection.getTable(TableName.valueOf("testtable"));

            // Load the test data before experimenting: row1..row3, two families,
            // two qualifiers, and two versions (timestamps 1 and 2) per cell.
            for (String row : new String[] {"row1", "row2", "row3"}) {
                Put put = new Put(Bytes.toBytes(row));
                for (String family : new String[] {"colfam1", "colfam2"}) {
                    for (String qualifier : new String[] {"qual1", "qual2"}) {
                        put.addColumn(Bytes.toBytes(family), Bytes.toBytes(qualifier), 1, Bytes.toBytes("val1"));
                        put.addColumn(Bytes.toBytes(family), Bytes.toBytes(qualifier), 2, Bytes.toBytes("val2"));
                    }
                }
                table.put(put);
            }

            // Scanner 1: scans with no conditions at all.
            System.out.println("Scanning table 1");
            Scan scan1 = new Scan();
            ResultScanner scanner1 = table.getScanner(scan1);
            for (Result res : scanner1) {
                System.out.println(res);
            }
            scanner1.close();

            // Scanner 2: scans only colfam1.
            System.out.println("Scanning table 2");
            Scan scan2 = new Scan();
            scan2.addFamily(Bytes.toBytes("colfam1"));
            ResultScanner scanner2 = table.getScanner(scan2);
            for (Result res : scanner2) {
                System.out.println(res);
            }
            scanner2.close();

            // Scanner 3: scans colfam1:qual1 and colfam2:qual2, with a start and a stop row.
            // withStartRow/withStopRow replace the deprecated setStartRow/setStopRow.
            System.out.println("Scanning table 3");
            Scan scan3 = new Scan();
            scan3.addColumn(Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"))
                .addColumn(Bytes.toBytes("colfam2"), Bytes.toBytes("qual2"))
                .withStartRow(Bytes.toBytes("row1"))
                .withStopRow(Bytes.toBytes("row3"));
            ResultScanner scanner3 = table.getScanner(scan3);
            for (Result res : scanner3) {
                System.out.println(res);
            }
            scanner3.close();

            // Scanner 4: scans only colfam2:qual2, with a start and a stop row.
            System.out.println("Scanning table 4");
            Scan scan4 = new Scan();
            scan4.addColumn(Bytes.toBytes("colfam2"), Bytes.toBytes("qual2"))
                .withStartRow(Bytes.toBytes("row1"))
                .withStopRow(Bytes.toBytes("row2"));
            ResultScanner scanner4 = table.getScanner(scan4);
            for (Result res : scanner4) {
                System.out.println(res);
            }
            scanner4.close();

            table.close();
            connection.close();
        }
    }

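    Running it gives the following results. The first listing is the table content as seen from the HBase shell, and the second is the output of the program.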


    hbase(main):014:0> scan 'testtable',{VERSIONS=>5}
    ROW                   COLUMN+CELL                                              
     row1                 column=colfam1:qual1, timestamp=2, value=val2            
     row1                 column=colfam1:qual1, timestamp=1, value=val1            
     row1                 column=colfam1:qual2, timestamp=2, value=val2            
     row1                 column=colfam1:qual2, timestamp=1, value=val1            
     row1                 column=colfam2:qual1, timestamp=2, value=val2            
     row1                 column=colfam2:qual1, timestamp=1, value=val1            
     row1                 column=colfam2:qual2, timestamp=2, value=val2            
     row1                 column=colfam2:qual2, timestamp=1, value=val1            
     row2                 column=colfam1:qual1, timestamp=2, value=val2            
     row2                 column=colfam1:qual1, timestamp=1, value=val1            
     row2                 column=colfam1:qual2, timestamp=2, value=val2            
     row2                 column=colfam1:qual2, timestamp=1, value=val1            
     row2                 column=colfam2:qual1, timestamp=2, value=val2            
     row2                 column=colfam2:qual1, timestamp=1, value=val1            
     row2                 column=colfam2:qual2, timestamp=2, value=val2            
     row2                 column=colfam2:qual2, timestamp=1, value=val1            
     row3                 column=colfam1:qual1, timestamp=2, value=val2            
     row3                 column=colfam1:qual1, timestamp=1, value=val1            
     row3                 column=colfam1:qual2, timestamp=2, value=val2            
     row3                 column=colfam1:qual2, timestamp=1, value=val1            
     row3                 column=colfam2:qual1, timestamp=2, value=val2            
     row3                 column=colfam2:qual1, timestamp=1, value=val1            
     row3                 column=colfam2:qual2, timestamp=2, value=val2            
     row3                 column=colfam2:qual2, timestamp=1, value=val1            
    3 row(s)
    Took 0.0676 seconds


    Scanning table 1
    keyvalues={row1/colfam1:qual1/2/Put/vlen=4/seqid=0, row1/colfam1:qual2/2/Put/vlen=4/seqid=0, row1/colfam2:qual1/2/Put/vlen=4/seqid=0, row1/colfam2:qual2/2/Put/vlen=4/seqid=0}
    keyvalues={row2/colfam1:qual1/2/Put/vlen=4/seqid=0, row2/colfam1:qual2/2/Put/vlen=4/seqid=0, row2/colfam2:qual1/2/Put/vlen=4/seqid=0, row2/colfam2:qual2/2/Put/vlen=4/seqid=0}
    keyvalues={row3/colfam1:qual1/2/Put/vlen=4/seqid=0, row3/colfam1:qual2/2/Put/vlen=4/seqid=0, row3/colfam2:qual1/2/Put/vlen=4/seqid=0, row3/colfam2:qual2/2/Put/vlen=4/seqid=0}
    Scanning table 2
    keyvalues={row1/colfam1:qual1/2/Put/vlen=4/seqid=0, row1/colfam1:qual2/2/Put/vlen=4/seqid=0}
    keyvalues={row2/colfam1:qual1/2/Put/vlen=4/seqid=0, row2/colfam1:qual2/2/Put/vlen=4/seqid=0}
    keyvalues={row3/colfam1:qual1/2/Put/vlen=4/seqid=0, row3/colfam1:qual2/2/Put/vlen=4/seqid=0}
    Scanning table 3
    keyvalues={row1/colfam1:qual1/2/Put/vlen=4/seqid=0, row1/colfam2:qual2/2/Put/vlen=4/seqid=0}
    keyvalues={row2/colfam1:qual1/2/Put/vlen=4/seqid=0, row2/colfam2:qual2/2/Put/vlen=4/seqid=0}
    Scanning table 4
    keyvalues={row1/colfam2:qual2/2/Put/vlen=4/seqid=0}



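    Caching vs. batching


    Rather than handling rows one at a time, wouldn't it be better to handle several at once where possible? Sending multiple rows in a single RPC is better for performance, and this feature is called scanner caching. According to the book, it is disabled by default.


    Unlike what the book describes, Table has no setScannerCaching method. Instead, Scan has the setCaching and getCaching methods.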




    According to the book, you can set it like this:

    <property>

    <name>hbase.client.scanner.caching</name>

    <value>10</value>

    </property>

    With this setting, the scanner caching of every Scan instance is set to 10. (The default is 1.)


    **I have not been able to verify this setting on 2.x.
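    To get a feel for why scanner caching matters, here is a small simulation that counts fetch round trips for a given caching value. CachingSketch is a hypothetical stand-in, not the HBase client: it ignores the calls that open and close a scanner as well as any result-size limits, and only counts fetches that carry rows.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Simulation only: each fetch round trip carries up to `caching` rows.
final class CachingSketch {
    private final List<String> serverRows;
    private final int caching;
    private final Deque<String> buffer = new ArrayDeque<>();
    private int position = 0;
    int fetchCalls = 0;

    CachingSketch(List<String> serverRows, int caching) {
        if (caching < 1) {
            throw new IllegalArgumentException("caching must be >= 1");
        }
        this.serverRows = serverRows;
        this.caching = caching;
    }

    // Serves rows from the local buffer and refills it only when it runs dry.
    String next() {
        if (buffer.isEmpty() && position < serverRows.size()) {
            fetch();
        }
        return buffer.poll(); // null when nothing is left
    }

    private void fetch() {
        fetchCalls++;
        int end = Math.min(position + caching, serverRows.size());
        buffer.addAll(serverRows.subList(position, end));
        position = end;
    }

    // Scans totalRows rows to the end and reports how many fetches were needed.
    static int countFetches(int totalRows, int caching) {
        List<String> rows = new ArrayList<>();
        for (int i = 1; i <= totalRows; i++) {
            rows.add("row-" + i);
        }
        CachingSketch scanner = new CachingSketch(rows, caching);
        while (scanner.next() != null) {
            // consume every row
        }
        return scanner.fetchCalls;
    }

    public static void main(String[] args) {
        System.out.println(countFetches(10, 1));  // 10
        System.out.println(countFetches(10, 3));  // 4
        System.out.println(countFetches(10, 10)); // 1
    }
}
```

    A larger caching value means fewer round trips, at the cost of holding more rows in client memory at once.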


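    Before trying it out, let's look at what problem the scanner lease mentioned above causes, and why a lease period has to be set at all.


    The example is shown below. After confirming that the lease period is currently 60000 ms, it uses the sleep method to wait for longer than that (65000 ms) and then tries to fetch row data with scanner.next(). Because the scanner was held for longer than the lease period, an error should occur. In my case, though, no error shows up.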

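    // Fragment: conf and connection are created as in the previous example.
    // Additionally needs: import org.apache.hadoop.hbase.HConstants;
    Table table = connection.getTable(TableName.valueOf("testtable"));
    Scan scan = new Scan();
    ResultScanner scanner = table.getScanner(scan);

    // Read the scanner timeout from the client configuration (-1 if it is not set).
    int scannerTimeout = (int) conf.getLong(
        HConstants.HBASE_CLIENT_SCANNER_TIMEOUT_PERIOD, -1);
    System.out.println("Current (local) lease period: " + scannerTimeout + "ms");
    System.out.println("Sleeping now for " + (scannerTimeout + 5000) + "ms...");
    try {
        // Keep the scanner idle for longer than the lease period.
        Thread.sleep(scannerTimeout + 5000);
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // restore the interrupt flag instead of swallowing it
    }
    System.out.println("Attempting to iterate over scanner...");
    while (true) {
        try {
            Result result = scanner.next();
            if (result == null) break; // scanner exhausted
            System.out.println(result);
        } catch (Exception e) {
            // The book expects the expired lease to surface here.
            e.printStackTrace();
            break;
        }
    }
    scanner.close();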

    Current (local) lease period: 60000ms
    Sleeping now for 65000ms...
    Attempting to iterate over scanner...


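    Anyway, in the book, where the error does occur, the error output includes an UnknownScannerException. The reason given is that the client is using the ID of a scanner that has already expired and been removed. Hmm, why don't I get the error? ㅠ_ㅠ


    One more thing to look at before coding is batching, which is set with the setBatch method. This method also lives on the Scan class. The caching we just saw works at the row level, whereas batching works at the column level: it controls how many columns each call to next() returns. If a row has more columns than the configured value, you receive the row in fragments. For example, with 17 columns and a batch size of 5, the row arrives as four Result instances holding 5, 5, 5 and 2 columns.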
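    The 17-columns example can be checked with a few lines of arithmetic. BatchSketch and fragmentSizes are hypothetical names; the sketch only computes how many columns each fragment of a single row would hold for a given batch size, and does not touch the HBase API.

```java
import java.util.ArrayList;
import java.util.List;

// Arithmetic sketch of how a batch size cuts one row into fragments.
final class BatchSketch {
    // Returns the number of columns in each fragment of a row with `columns` columns.
    static List<Integer> fragmentSizes(int columns, int batch) {
        if (batch < 1) {
            throw new IllegalArgumentException("batch must be >= 1");
        }
        List<Integer> sizes = new ArrayList<>();
        for (int remaining = columns; remaining > 0; remaining -= batch) {
            // Every fragment is full except possibly the last one.
            sizes.add(Math.min(batch, remaining));
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(fragmentSizes(17, 5)); // [5, 5, 5, 2]
    }
}
```

    Only the last fragment can be shorter than the batch size; when the column count is an exact multiple of the batch size, every fragment is full.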



    First, before the experiment, I filled the table with data:


    hbase(main):023:0> scan 'testtable'
    ROW                                              COLUMN+CELL                                                                                                                                
     row-1                                           column=colfam1:col-1, timestamp=1534591638254, value=val-1.1                                                                               
     row-1                                           column=colfam1:col-10, timestamp=1534591638326, value=val-1.10                                                                             
     row-1                                           column=colfam1:col-2, timestamp=1534591638265, value=val-1.2                                                                               
     row-1                                           column=colfam1:col-3, timestamp=1534591638273, value=val-1.3                                                                               
     row-1                                           column=colfam1:col-4, timestamp=1534591638282, value=val-1.4                                                                               
     row-1                                           column=colfam1:col-5, timestamp=1534591638290, value=val-1.5                                                                               
     row-1                                           column=colfam1:col-6, timestamp=1534591638299, value=val-1.6                                                                               
     row-1                                           column=colfam1:col-7, timestamp=1534591638305, value=val-1.7                                                                               
     row-1                                           column=colfam1:col-8, timestamp=1534591638312, value=val-1.8                                                                               
     row-1                                           column=colfam1:col-9, timestamp=1534591638320, value=val-1.9                                                                               
     row-1                                           column=colfam2:col-1, timestamp=1534591638254, value=val-1.1                                                                               
     row-1                                           column=colfam2:col-10, timestamp=1534591638326, value=val-1.10                                                                             
     row-1                                           column=colfam2:col-2, timestamp=1534591638265, value=val-1.2                                                                               
     row-1                                           column=colfam2:col-3, timestamp=1534591638273, value=val-1.3                                                                               
     row-1                                           column=colfam2:col-4, timestamp=1534591638282, value=val-1.4                                                                               
     row-1                                           column=colfam2:col-5, timestamp=1534591638290, value=val-1.5                                                                               
     row-1                                           column=colfam2:col-6, timestamp=1534591638299, value=val-1.6                                                                               
     row-1                                           column=colfam2:col-7, timestamp=1534591638305, value=val-1.7                                                                               
     row-1                                           column=colfam2:col-8, timestamp=1534591638312, value=val-1.8                                                                               
     row-1                                           column=colfam2:col-9, timestamp=1534591638320, value=val-1.9                                                                               
     row-10                                          column=colfam1:col-1, timestamp=1534591638967, value=val-10.1                                                                              
     row-10                                          column=colfam1:col-10, timestamp=1534591639040, value=val-10.10                                                                            
     row-10                                          column=colfam1:col-2, timestamp=1534591638974, value=val-10.2                                                                              
     row-10                                          column=colfam1:col-3, timestamp=1534591638979, value=val-10.3                                                                              
     row-10                                          column=colfam1:col-4, timestamp=1534591638986, value=val-10.4                                                                              
     row-10                                          column=colfam1:col-5, timestamp=1534591638995, value=val-10.5                                                                              
     row-10                                          column=colfam1:col-6, timestamp=1534591639004, value=val-10.6                                                                              
     row-10                                          column=colfam1:col-7, timestamp=1534591639012, value=val-10.7                                                                              
     row-10                                          column=colfam1:col-8, timestamp=1534591639020, value=val-10.8                                                                              
     row-10                                          column=colfam1:col-9, timestamp=1534591639029, value=val-10.9                                                                              
     row-10                                          column=colfam2:col-1, timestamp=1534591638967, value=val-10.1                                                                              
     row-10                                          column=colfam2:col-10, timestamp=1534591639040, value=val-10.10                                                                            
     row-10                                          column=colfam2:col-2, timestamp=1534591638974, value=val-10.2                                                                              
     row-10                                          column=colfam2:col-3, timestamp=1534591638979, value=val-10.3                                                                              
     row-10                                          column=colfam2:col-4, timestamp=1534591638986, value=val-10.4                                                                              
     row-10                                          column=colfam2:col-5, timestamp=1534591638995, value=val-10.5                                                                              
     row-10                                          column=colfam2:col-6, timestamp=1534591639004, value=val-10.6                                                                              
     row-10                                          column=colfam2:col-7, timestamp=1534591639012, value=val-10.7                                                                              
     row-10                                          column=colfam2:col-8, timestamp=1534591639020, value=val-10.8                                                                              
     row-10                                          column=colfam2:col-9, timestamp=1534591639029, value=val-10.9                                                                              
     row-2                                           column=colfam1:col-1, timestamp=1534591638332, value=val-2.1                                                                               
     row-2                                           column=colfam1:col-10, timestamp=1534591638413, value=val-2.10                                                                             
     row-2                                           column=colfam1:col-2, timestamp=1534591638338, value=val-2.2                                                                               
     row-2                                           column=colfam1:col-3, timestamp=1534591638346, value=val-2.3                                                                               
     row-2                                           column=colfam1:col-4, timestamp=1534591638354, value=val-2.4                                                                               
     row-2                                           column=colfam1:col-5, timestamp=1534591638361, value=val-2.5                                                                               
     row-2                                           column=colfam1:col-6, timestamp=1534591638380, value=val-2.6                                                                               
     row-2                                           column=colfam1:col-7, timestamp=1534591638385, value=val-2.7                                                                               
     row-2                                           column=colfam1:col-8, timestamp=1534591638391, value=val-2.8                                                                               
     row-2                                           column=colfam1:col-9, timestamp=1534591638406, value=val-2.9                                                                               
     row-2                                           column=colfam2:col-1, timestamp=1534591638332, value=val-2.1                                                                               
     row-2                                           column=colfam2:col-10, timestamp=1534591638413, value=val-2.10                                                                             
     row-2                                           column=colfam2:col-2, timestamp=1534591638338, value=val-2.2                                                                               
     row-2                                           column=colfam2:col-3, timestamp=1534591638346, value=val-2.3                                                                               
     row-2                                           column=colfam2:col-4, timestamp=1534591638354, value=val-2.4                                                                               
     row-2                                           column=colfam2:col-5, timestamp=1534591638361, value=val-2.5                                                                               
     row-2                                           column=colfam2:col-6, timestamp=1534591638380, value=val-2.6                                                                               
     row-2                                           column=colfam2:col-7, timestamp=1534591638385, value=val-2.7                                                                               
     row-2                                           column=colfam2:col-8, timestamp=1534591638391, value=val-2.8                                                                               
     row-2                                           column=colfam2:col-9, timestamp=1534591638406, value=val-2.9                                                                               
     row-3                                           column=colfam1:col-1, timestamp=1534591638420, value=val-3.1                                                                               
     row-3                                           column=colfam1:col-10, timestamp=1534591638495, value=val-3.10                                                                             
     row-3                                           column=colfam1:col-2, timestamp=1534591638436, value=val-3.2                                                                               
     row-3                                           column=colfam1:col-3, timestamp=1534591638443, value=val-3.3                                                                               
     row-3                                           column=colfam1:col-4, timestamp=1534591638451, value=val-3.4                                                                               
     row-3                                           column=colfam1:col-5, timestamp=1534591638458, value=val-3.5                                                                               
     row-3                                           column=colfam1:col-6, timestamp=1534591638470, value=val-3.6                                                                               
     row-3                                           column=colfam1:col-7, timestamp=1534591638476, value=val-3.7                                                                               
     row-3                                           column=colfam1:col-8, timestamp=1534591638484, value=val-3.8                                                                               
     row-3                                           column=colfam1:col-9, timestamp=1534591638489, value=val-3.9                                                                               
     row-3                                           column=colfam2:col-1, timestamp=1534591638420, value=val-3.1                                                                               
     row-3                                           column=colfam2:col-10, timestamp=1534591638495, value=val-3.10                                                                             
     row-3                                           column=colfam2:col-2, timestamp=1534591638436, value=val-3.2                                                                               
     row-3                                           column=colfam2:col-3, timestamp=1534591638443, value=val-3.3                                                                               
     row-3                                           column=colfam2:col-4, timestamp=1534591638451, value=val-3.4                                                                               
     row-3                                           column=colfam2:col-5, timestamp=1534591638458, value=val-3.5                                                                               
     row-3                                           column=colfam2:col-6, timestamp=1534591638470, value=val-3.6                                                                               
     row-3                                           column=colfam2:col-7, timestamp=1534591638476, value=val-3.7                                                                               
     row-3                                           column=colfam2:col-8, timestamp=1534591638484, value=val-3.8                                                                               
     row-3                                           column=colfam2:col-9, timestamp=1534591638489, value=val-3.9                                                                               
     row-4                                           column=colfam1:col-1, timestamp=1534591638501, value=val-4.1                                                                               
     row-4                                           column=colfam1:col-10, timestamp=1534591638585, value=val-4.10                                                                             
     row-4                                           column=colfam1:col-2, timestamp=1534591638509, value=val-4.2                                                                               
     row-4                                           column=colfam1:col-3, timestamp=1534591638525, value=val-4.3                                                                               
     row-4                                           column=colfam1:col-4, timestamp=1534591638531, value=val-4.4                                                                               
     row-4                                           column=colfam1:col-5, timestamp=1534591638538, value=val-4.5                                                                               
     row-4                                           column=colfam1:col-6, timestamp=1534591638544, value=val-4.6                                                                               
     row-4                                           column=colfam1:col-7, timestamp=1534591638558, value=val-4.7                                                                               
     row-4                                           column=colfam1:col-8, timestamp=1534591638567, value=val-4.8                                                                               
     row-4                                           column=colfam1:col-9, timestamp=1534591638577, value=val-4.9                                                                               
     row-4                                           column=colfam2:col-1, timestamp=1534591638501, value=val-4.1                                                                               
     row-4                                           column=colfam2:col-10, timestamp=1534591638585, value=val-4.10                                                                             
     row-4                                           column=colfam2:col-2, timestamp=1534591638509, value=val-4.2                                                                               
     row-4                                           column=colfam2:col-3, timestamp=1534591638525, value=val-4.3                                                                               
     row-4                                           column=colfam2:col-4, timestamp=1534591638531, value=val-4.4                                                                               
     row-4                                           column=colfam2:col-5, timestamp=1534591638538, value=val-4.5                                                                               
     row-4                                           column=colfam2:col-6, timestamp=1534591638544, value=val-4.6                                                                               
     row-4                                           column=colfam2:col-7, timestamp=1534591638558, value=val-4.7                                                                               
     row-4                                           column=colfam2:col-8, timestamp=1534591638567, value=val-4.8                                                                               
     row-4                                           column=colfam2:col-9, timestamp=1534591638577, value=val-4.9                                                                               
     row-5                                           column=colfam1:col-1, timestamp=1534591638590, value=val-5.1                                                                               
     row-5                                           column=colfam1:col-10, timestamp=1534591638648, value=val-5.10                                                                             
     row-5                                           column=colfam1:col-2, timestamp=1534591638594, value=val-5.2                                                                               
     row-5                                           column=colfam1:col-3, timestamp=1534591638600, value=val-5.3                                                                               
     row-5                                           column=colfam1:col-4, timestamp=1534591638606, value=val-5.4                                                                               
     row-5                                           column=colfam1:col-5, timestamp=1534591638612, value=val-5.5                                                                               
     row-5                                           column=colfam1:col-6, timestamp=1534591638619, value=val-5.6                                                                               
     row-5                                           column=colfam1:col-7, timestamp=1534591638627, value=val-5.7                                                                               
     row-5                                           column=colfam1:col-8, timestamp=1534591638635, value=val-5.8                                                                               
     row-5                                           column=colfam1:col-9, timestamp=1534591638642, value=val-5.9                                                                               
     row-5                                           column=colfam2:col-1, timestamp=1534591638590, value=val-5.1                                                                               
     row-5                                           column=colfam2:col-10, timestamp=1534591638648, value=val-5.10                                                                             
     row-5                                           column=colfam2:col-2, timestamp=1534591638594, value=val-5.2                                                                               
     row-5                                           column=colfam2:col-3, timestamp=1534591638600, value=val-5.3                                                                               
     row-5                                           column=colfam2:col-4, timestamp=1534591638606, value=val-5.4                                                                               
     row-5                                           column=colfam2:col-5, timestamp=1534591638612, value=val-5.5                                                                               
     row-5                                           column=colfam2:col-6, timestamp=1534591638619, value=val-5.6                                                                               
     row-5                                           column=colfam2:col-7, timestamp=1534591638627, value=val-5.7                                                                               
     row-5                                           column=colfam2:col-8, timestamp=1534591638635, value=val-5.8                                                                               
     row-5                                           column=colfam2:col-9, timestamp=1534591638642, value=val-5.9                                                                               
     row-6                                           column=colfam1:col-1, timestamp=1534591638656, value=val-6.1                                                                               
     row-6                                           column=colfam1:col-10, timestamp=1534591638722, value=val-6.10                                                                             
     row-6                                           column=colfam1:col-2, timestamp=1534591638664, value=val-6.2                                                                               
     row-6                                           column=colfam1:col-3, timestamp=1534591638678, value=val-6.3                                                                               
     row-6                                           column=colfam1:col-4, timestamp=1534591638685, value=val-6.4                                                                               
     row-6                                           column=colfam1:col-5, timestamp=1534591638690, value=val-6.5                                                                               
     row-6                                           column=colfam1:col-6, timestamp=1534591638695, value=val-6.6                                                                               
     row-6                                           column=colfam1:col-7, timestamp=1534591638700, value=val-6.7                                                                               
     row-6                                           column=colfam1:col-8, timestamp=1534591638705, value=val-6.8                                                                               
     row-6                                           column=colfam1:col-9, timestamp=1534591638712, value=val-6.9                                                                               
     row-6                                           column=colfam2:col-1, timestamp=1534591638656, value=val-6.1                                                                               
     row-6                                           column=colfam2:col-10, timestamp=1534591638722, value=val-6.10                                                                             
     row-6                                           column=colfam2:col-2, timestamp=1534591638664, value=val-6.2                                                                               
     row-6                                           column=colfam2:col-3, timestamp=1534591638678, value=val-6.3                                                                               
     row-6                                           column=colfam2:col-4, timestamp=1534591638685, value=val-6.4                                                                               
     row-6                                           column=colfam2:col-5, timestamp=1534591638690, value=val-6.5                                                                               
     row-6                                           column=colfam2:col-6, timestamp=1534591638695, value=val-6.6                                                                               
     row-6                                           column=colfam2:col-7, timestamp=1534591638700, value=val-6.7                                                                               
     row-6                                           column=colfam2:col-8, timestamp=1534591638705, value=val-6.8                                                                               
     row-6                                           column=colfam2:col-9, timestamp=1534591638712, value=val-6.9                                                                               
     row-7                                           column=colfam1:col-1, timestamp=1534591638726, value=val-7.1                                                                               
     row-7                                           column=colfam1:col-10, timestamp=1534591638805, value=val-7.10                                                                             
     row-7                                           column=colfam1:col-2, timestamp=1534591638732, value=val-7.2                                                                               
     row-7                                           column=colfam1:col-3, timestamp=1534591638737, value=val-7.3                                                                               
     row-7                                           column=colfam1:col-4, timestamp=1534591638742, value=val-7.4                                                                               
     row-7                                           column=colfam1:col-5, timestamp=1534591638749, value=val-7.5                                                                               
     row-7                                           column=colfam1:col-6, timestamp=1534591638753, value=val-7.6                                                                               
     row-7                                           column=colfam1:col-7, timestamp=1534591638758, value=val-7.7                                                                               
     row-7                                           column=colfam1:col-8, timestamp=1534591638764, value=val-7.8                                                                               
     row-7                                           column=colfam1:col-9, timestamp=1534591638799, value=val-7.9                                                                               
     row-7                                           column=colfam2:col-1, timestamp=1534591638726, value=val-7.1                                                                               
     row-7                                           column=colfam2:col-10, timestamp=1534591638805, value=val-7.10                                                                             
     row-7                                           column=colfam2:col-2, timestamp=1534591638732, value=val-7.2                                                                               
     row-7                                           column=colfam2:col-3, timestamp=1534591638737, value=val-7.3                                                                               
     row-7                                           column=colfam2:col-4, timestamp=1534591638742, value=val-7.4                                                                               
     row-7                                           column=colfam2:col-5, timestamp=1534591638749, value=val-7.5                                                                               
     row-7                                           column=colfam2:col-6, timestamp=1534591638753, value=val-7.6                                                                               
     row-7                                           column=colfam2:col-7, timestamp=1534591638758, value=val-7.7                                                                               
     row-7                                           column=colfam2:col-8, timestamp=1534591638764, value=val-7.8                                                                               
     row-7                                           column=colfam2:col-9, timestamp=1534591638799, value=val-7.9                                                                               
     row-8                                           column=colfam1:col-1, timestamp=1534591638811, value=val-8.1                                                                               
     row-8                                           column=colfam1:col-10, timestamp=1534591638879, value=val-8.10                                                                             
     row-8                                           column=colfam1:col-2, timestamp=1534591638826, value=val-8.2                                                                               
     row-8                                           column=colfam1:col-3, timestamp=1534591638835, value=val-8.3                                                                               
     row-8                                           column=colfam1:col-4, timestamp=1534591638840, value=val-8.4                                                                               
     row-8                                           column=colfam1:col-5, timestamp=1534591638845, value=val-8.5                                                                               
     row-8                                           column=colfam1:col-6, timestamp=1534591638851, value=val-8.6                                                                               
     row-8                                           column=colfam1:col-7, timestamp=1534591638859, value=val-8.7                                                                               
     row-8                                           column=colfam1:col-8, timestamp=1534591638868, value=val-8.8                                                                               
     row-8                                           column=colfam1:col-9, timestamp=1534591638874, value=val-8.9                                                                               
     row-8                                           column=colfam2:col-1, timestamp=1534591638811, value=val-8.1                                                                               
     row-8                                           column=colfam2:col-10, timestamp=1534591638879, value=val-8.10                                                                             
     row-8                                           column=colfam2:col-2, timestamp=1534591638826, value=val-8.2                                                                               
     row-8                                           column=colfam2:col-3, timestamp=1534591638835, value=val-8.3                                                                               
     row-8                                           column=colfam2:col-4, timestamp=1534591638840, value=val-8.4                                                                               
     row-8                                           column=colfam2:col-5, timestamp=1534591638845, value=val-8.5                                                                               
     row-8                                           column=colfam2:col-6, timestamp=1534591638851, value=val-8.6                                                                               
     row-8                                           column=colfam2:col-7, timestamp=1534591638859, value=val-8.7                                                                               
     row-8                                           column=colfam2:col-8, timestamp=1534591638868, value=val-8.8                                                                               
     row-8                                           column=colfam2:col-9, timestamp=1534591638874, value=val-8.9                                                                               
     row-9                                           column=colfam1:col-1, timestamp=1534591638886, value=val-9.1                                                                               
     row-9                                           column=colfam1:col-10, timestamp=1534591638960, value=val-9.10                                                                             
     row-9                                           column=colfam1:col-2, timestamp=1534591638893, value=val-9.2                                                                               
     row-9                                           column=colfam1:col-3, timestamp=1534591638905, value=val-9.3                                                                               
     row-9                                           column=colfam1:col-4, timestamp=1534591638912, value=val-9.4                                                                               
     row-9                                           column=colfam1:col-5, timestamp=1534591638918, value=val-9.5                                                                               
     row-9                                           column=colfam1:col-6, timestamp=1534591638927, value=val-9.6                                                                               
     row-9                                           column=colfam1:col-7, timestamp=1534591638937, value=val-9.7                                                                               
     row-9                                           column=colfam1:col-8, timestamp=1534591638944, value=val-9.8                                                                               
     row-9                                           column=colfam1:col-9, timestamp=1534591638953, value=val-9.9                                                                               
     row-9                                           column=colfam2:col-1, timestamp=1534591638886, value=val-9.1                                                                               
     row-9                                           column=colfam2:col-10, timestamp=1534591638960, value=val-9.10                                                                             
     row-9                                           column=colfam2:col-2, timestamp=1534591638893, value=val-9.2                                                                               
     row-9                                           column=colfam2:col-3, timestamp=1534591638905, value=val-9.3                                                                               
     row-9                                           column=colfam2:col-4, timestamp=1534591638912, value=val-9.4                                                                               
     row-9                                           column=colfam2:col-5, timestamp=1534591638918, value=val-9.5                                                                               
     row-9                                           column=colfam2:col-6, timestamp=1534591638927, value=val-9.6                                                                               
     row-9                                           column=colfam2:col-7, timestamp=1534591638937, value=val-9.7                                                                               
     row-9                                           column=colfam2:col-8, timestamp=1534591638944, value=val-9.8                                                                               
     row-9                                           column=colfam2:col-9, timestamp=1534591638953, value=val-9.9                                                                               
    10 row(s)
    Took 0.4673 seconds


    Write the following code.

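    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.client.metrics.ScanMetrics;

    public class ScanCacheBatchExample {

        public static Table table = null;
        public static Configuration conf = null;
        public static Connection connection = null;

        // See how the outcome changes with each caching, batch, and small setting.
        private static void scan(int caching, int batch, boolean small)
                throws IOException {
            int count = 0;
            Scan scan = new Scan()
                    .setCaching(caching)   // Results fetched per RPC
                    .setBatch(batch)       // max columns per Result (0 = whole row)
                    .setSmall(small)
                    .setScanMetricsEnabled(true);
            // try-with-resources closes the scanner even if iteration throws.
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result result : scanner) {
                    count++;               // count the Result instances received
                }
            }
            // Metrics are read after the scanner has been closed.
            ScanMetrics metrics = scan.getScanMetrics();
            System.out.println("Caching: " + caching + ", Batch: " + batch +
                    ", Small: " + small + ", Results: " + count +
                    ", RPCs: " + metrics.countOfRPCcalls);
        }

        public static void main(String[] args) throws IOException {
            conf = HBaseConfiguration.create();
            connection = ConnectionFactory.createConnection(conf);
            table = connection.getTable(TableName.valueOf("testtable"));

            // Test various combinations.
            scan(1, 1, false);
            scan(1, 0, false);
            scan(1, 0, true);
            scan(200, 1, false);
            scan(200, 0, false);
            scan(200, 0, true);
            scan(2000, 100, false);
            scan(2, 100, false);
            scan(2, 10, false);
            scan(5, 100, false);
            scan(5, 20, false);
            scan(10, 10, false);

            // Release the table and the connection when done.
            table.close();
            connection.close();
        }
    }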

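    Running it prints the following.

    (*Each line shows the caching and batch values used, the number of Result instances returned by the server, and finally the number of RPCs it took to fetch them.)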


    Caching: 1, Batch: 1, Small: false, Results: 200, RPCs: 201
    Caching: 1, Batch: 0, Small: false, Results: 10, RPCs: 10
    Caching: 1, Batch: 0, Small: true, Results: 10, RPCs: 10
    Caching: 200, Batch: 1, Small: false, Results: 200, RPCs: 2
    Caching: 200, Batch: 0, Small: false, Results: 10, RPCs: 1
    Caching: 200, Batch: 0, Small: true, Results: 10, RPCs: 1
    Caching: 2000, Batch: 100, Small: false, Results: 10, RPCs: 1
    Caching: 2, Batch: 100, Small: false, Results: 10, RPCs: 5
    Caching: 2, Batch: 10, Small: false, Results: 20, RPCs: 11
    Caching: 5, Batch: 100, Small: false, Results: 10, RPCs: 2
    Caching: 5, Batch: 20, Small: false, Results: 10, RPCs: 3
    Caching: 10, Batch: 10, Small: false, Results: 20, RPCs: 3


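    Let's look at just the few cases the book walks through... hmm, this is tricky. Each case below is written as caching, batch, Results, RPCs.


    The first result, 1,1,200,201: every column becomes its own Result instance and every Result is sent in its own RPC. The one extra RPC is the call that confirms the scan is complete.


    The second case, 200,1,200,2: every column is still its own Result instance, but all of them are sent in a single RPC. As above, one more RPC is added to confirm that the scan is complete.


    Next, 2,10,20,11: the batch size is half the row width (10 of the 20 columns per row), so the 200 columns are split into 20 Results (200/10). Transferring them takes 10 RPCs (20 Results at 2 per RPC), plus, again, 1 to confirm completion.


    5,100,10,2 (the book shows 3 RPCs here; I don't know why I got 2): the batch value is far larger than needed, so all 20 columns of a row end up in a single Result.


    5,20,10,3: the batch size matches the number of columns per row.


    Finally, 10,10,20,3: the table is broken into smaller Result instances, but because the caching value is larger, only two RPCs are needed to transfer the data (the third is the completion check).


    Let's look at a figure, since this is hard to follow. The number of RPCs is determined by the combination of caching and batch.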


    RPCs = (Rows * Cols per Row) / Min(Cols per Row, Batch Size) / Scanner Caching
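
To make the formula concrete, here is a minimal sketch in plain Java (no HBase dependency) that applies it to the 10-row, 20-column test table from this post. The class and method names are made up for illustration. Two things are assumptions on my part: the divisions are rounded up, and a batch of 0 is treated as "whole row in one Result". The result is the number of data-carrying RPCs only; the extra RPC that confirms the end of the scan is not included.

```java
public class ScanRpcEstimate {

    // Number of Result instances: each row is cut into chunks of at most `batch` columns.
    // ASSUMPTION: batch <= 0 means "no batching", i.e. one Result per row.
    static long results(long rows, long colsPerRow, long batch) {
        long colsPerResult = (batch <= 0) ? colsPerRow : Math.min(colsPerRow, batch);
        long resultsPerRow = (colsPerRow + colsPerResult - 1) / colsPerResult; // round up
        return rows * resultsPerRow;
    }

    // Number of RPCs that carry data: `caching` Results travel in each RPC.
    // The final "scan is complete" RPC is NOT counted here.
    static long dataRpcs(long rows, long colsPerRow, long batch, long caching) {
        long r = results(rows, colsPerRow, batch);
        return (r + caching - 1) / caching; // round up
    }

    public static void main(String[] args) {
        long rows = 10, colsPerRow = 20;    // shape of testtable in this post
        long[][] cases = {                  // {caching, batch}
            {1, 1}, {200, 1}, {2, 10}, {5, 20}, {10, 10}
        };
        for (long[] c : cases) {
            System.out.println("Caching: " + c[0] + ", Batch: " + c[1]
                + ", Results: " + results(rows, colsPerRow, c[1])
                + ", data RPCs: " + dataRpcs(rows, colsPerRow, c[1], c[0]));
        }
    }
}
```

For these five combinations, the Results values equal the counts in the output above, and adding 1 to the data RPCs gives the measured RPC numbers (201, 2, 11, 3, 3). Some other rows of the measured output, such as 5,100, do not fit the "+1" pattern, so treat this as an estimate rather than an exact model.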


    The figure shows the following setup.

    table rows: 9

    caching 6: so each RPC carries 6 Results.

    batch 3: so the columns are grouped three at a time (one Result = 3 columns).

    RPCs 3: the largest light-green boxes
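
The grouping in the figure can also be acted out step by step. This is only a sketch in plain Java: it builds fake cell names, cuts each row into Results of at most `batch` columns, then packs the Results into RPCs of at most `caching` Results. The text does not say how many columns each row has, so `colsPerRow = 6` is purely my assumption, picked so that each row splits into two Results. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class ScanFigureSketch {

    // Cut every row into Results holding at most `batch` columns each (batch must be > 0).
    static List<List<String>> toResults(int rows, int colsPerRow, int batch) {
        List<List<String>> results = new ArrayList<>();
        for (int r = 1; r <= rows; r++) {
            for (int start = 1; start <= colsPerRow; start += batch) {
                int end = Math.min(start + batch - 1, colsPerRow);
                List<String> cells = new ArrayList<>();
                for (int c = start; c <= end; c++) {
                    cells.add("row-" + r + "/col-" + c);
                }
                results.add(cells); // a Result never spans two rows
            }
        }
        return results;
    }

    // Pack the Results into RPC payloads of at most `caching` Results each.
    static List<List<List<String>>> toRpcs(List<List<String>> results, int caching) {
        List<List<List<String>>> rpcs = new ArrayList<>();
        for (int i = 0; i < results.size(); i += caching) {
            int end = Math.min(i + caching, results.size());
            rpcs.add(new ArrayList<>(results.subList(i, end)));
        }
        return rpcs;
    }

    public static void main(String[] args) {
        int rows = 9, caching = 6, batch = 3; // values given for the figure
        int colsPerRow = 6;                   // ASSUMPTION: not stated in the text
        List<List<String>> results = toResults(rows, colsPerRow, batch);
        List<List<List<String>>> rpcs = toRpcs(results, caching);
        System.out.println("Results: " + results.size() + ", data RPCs: " + rpcs.size());
    }
}
```

Under that assumption the 9 rows turn into 18 Results, and 18 Results at 6 per RPC give the 3 RPCs shown in the figure.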


