Query not found problem with SciDB 18.1

Hello.
We are loading data into SciDB 18.1 and redimensioning it. At one point we get a "query not found" error, but we cannot identify the cause. Here is the log file content; the same pattern repeats multiple times.
Please let me know what could cause this problem.

{code}
Error id: scidb::SCIDB_SE_QPROC::SCIDB_LE_QUERY_NOT_FOUND
Error description: Query processor error. Query 0.1578655327675558077 not found.
2020-01-10 20:22:07.000921 [0x7fed8bee4700] [INFO ]: Executing query(id:0.1578655327891620658, user:scidbadmin, ns:public): insert(redimension(apply(temp_22698to24443, date, 20190715135704, utm_zone, 10, latitude, x, longitude, y, elevation, b1), KOMPSAT_10_10, false), KOMPSAT_10_10) ; from program: IP /opt/scidb/18.1/bin/iquery --host IP -n -q SET LANG AFL; insert(redimension(apply(temp_22698to24443, date, 20190715135704, utm_zone, 10, latitude, x, longitude, y, elevation, b1), KOMPSAT_10_10, false), KOMPSAT_10_10) ;
2020-01-10 20:22:14.000269 [0x7fed8bee4700] [WARN ]: RedimensionCommon::redimensionArray: Data collision is detected at cell position {20190715135704, 10, 0, 0} for attribute ID = 1. Add log4j.logger.scidb.array.RedimensionCommon=TRACE to the log4cxx config file for more
2020-01-10 20:22:21.000677 [0x7fed8cbf1700] [INFO ]: QueryTiming 0 0.1578655327891620658 TOT 13.785478 ACT 13.779093 CPU 6.439395 wPG 0.000000 wFSr 0.000000 wFSw 0.000000 wFSws 0.000000 wFSf 0.000000 wSMld 0.000000 wBFrd 0.000000 wSMcm 0.000000 wREP 0.000000 wNETr 0.000000 wNETs 0.000000 wNETrr 0.000000 wNETrc 0.000000 wSGr 0.000000 wSGb 0.000000 wBAR 0.000000 wJsrt0.000000 wEXT 0.000000 wSEMo 0.000000 wEVo 0.000000 wLTCH 0.000000 wRare 0.000000 wZero 0.000000 OTH% 53.3 insert(redimension(apply(temp_22698to24443, date, 20190715135704, utm_zone, 10, latitude, x, longitude, y, elevation, b1), KOMPSAT_10_10, false), KOMPSAT_10_10)
2020-01-10 20:22:21.000702 [0x7fed8b6dc700] [INFO ]: Executing query(id:0.1578655341687374636, user:scidbadmin, ns:public): remove(temp_22698to24443); from program: 127.0.0.1:43922 /opt/scidb/18.1/bin/shim shim -s 1239 -t /tmp -m 2048 -o 36000 -i 0 ;
2020-01-10 20:22:21.000769 [0x7fed8b6dc700] [INFO ]: QueryTiming 0 0.1578655341687374636 TOT 0.082263 ACT 0.077383 CPU 0.007473 wPG 0.000000 wFSr 0.000000 wFSw 0.000000 wFSws 0.000000 wFSf 0.000000 wSMld 0.000000 wBFrd 0.000000 wSMcm 0.000000 wREP 0.000000 wNETr 0.000000 wNETs 0.000000 wNETrr 0.000000 wNETrc 0.000000 wSGr 0.000000 wSGb 0.000000 wBAR 0.000000 wJsrt0.000000 wEXT 0.000000 wSEMo 0.000000 wEVo 0.000000 wLTCH 0.000000 wRare 0.000000 wZero 0.000000 OTH% 90.3 remove(temp_22698to24443)
2020-01-10 20:22:21.000770 [0x7fed9e896fc0] [ERROR]: Dropping mtCompleteQuery for queryID=0.1578655341687374636, from CLIENT because SystemException in file: src/query/Query.cpp function: getQueryByID line: 1078
Error id: scidb::SCIDB_SE_QPROC::SCIDB_LE_QUERY_NOT_FOUND
Error description: Query processor error. Query 0.1578655341687374636 not found.
2020-01-10 20:22:21.000913 [0x7fed8c2e8700] [INFO ]: Executing query(id:0.1578655341885710324, user:scidbadmin, ns:public): insert(redimension(apply(temp_24444to26189, date, 20190715135704, utm_zone, 10, latitude, x, longitude, y, elevation, b1), KOMPSAT_10_10, false), KOMPSAT_10_10) ; from program: IP /opt/scidb/18.1/bin/iquery --host IP -n -q SET LANG AFL; insert(redimension(apply(temp_24444to26189, date, 20190715135704, utm_zone, 10, latitude, x, longitude, y, elevation, b1), KOMPSAT_10_10, false), KOMPSAT_10_10) ;
2020-01-10 20:22:27.000640 [0x7fed8c2e8700] [WARN ]: RedimensionCommon::redimensionArray: Data collision is detected at cell position {20190715135704, 10, 0, 0} for attribute ID = 1. Add log4j.logger.scidb.array.RedimensionCommon=TRACE to the log4cxx config file for more
2020-01-10 20:22:32.000972 [0x7fed8bfe5700] [INFO ]: QueryTiming 0 0.1578655341885710324 TOT 11.086318 ACT 11.080448 CPU 4.042561 wPG 0.000000 wFSr 0.000000 wFSw 0.000000 wFSws 0.000000 wFSf 0.000000 wSMld 0.000000 wBFrd 0.000000 wSMcm 0.000000 wREP 0.000000 wNETr 0.000000 wNETs 0.000000 wNETrr 0.000000 wNETrc 0.000000 wSGr 0.000000 wSGb 0.000000 wBAR 0.000000 wJsrt0.000000 wEXT 0.000000 wSEMo 0.000000 wEVo 0.000000 wLTCH 0.000000 wRare 0.000000 wZero 0.000000 OTH% 63.5 insert(redimension(apply(temp_24444to26189, date, 20190715135704, utm_zone, 10, latitude, x, longitude, y, elevation, b1), KOMPSAT_10_10, false), KOMPSAT_10_10)
2020-01-10 20:22:32.000998 [0x7fed8dd02700] [INFO ]: Executing query(id:0.1578655352985201930, user:scidbadmin, ns:public): remove(temp_24444to26189); from program: 127.0.0.1:43922 /opt/scidb/18.1/bin/shim shim -s 1239 -t /tmp -m 2048 -o 36000 -i 0 ;
2020-01-10 20:22:33.000062 [0x7fed8dd02700] [INFO ]: QueryTiming 0 0.1578655352985201930 TOT 0.077594 ACT 0.074693 CPU 0.008050 wPG 0.000000 wFSr 0.000000 wFSw 0.000000 wFSws 0.000000 wFSf 0.000000 wSMld 0.000000 wBFrd 0.000000 wSMcm 0.000000 wREP 0.000000 wNETr 0.000000 wNETs 0.000000 wNETrr 0.000000 wNETrc 0.000000 wSGr 0.000000 wSGb 0.000000 wBAR 0.000000 wJsrt0.000000 wEXT 0.000000 wSEMo 0.000000 wEVo 0.000000 wLTCH 0.000000 wRare 0.000000 wZero 0.000000 OTH% 89.2 remove(temp_24444to26189)
2020-01-10 20:22:33.000063 [0x7fed9e896fc0] [ERROR]: Dropping mtCompleteQuery for queryID=0.1578655352985201930, from CLIENT because SystemException in file: src/query/Query.cpp function: getQueryByID line: 1078
Error id: scidb::SCIDB_SE_QPROC::SCIDB_LE_QUERY_NOT_FOUND
Error description: Query processor error. Query 0.1578655352985201930 not found.
2020-01-10 20:22:33.000208 [0x7fed8cff5700] [INFO ]: Executing query(id:0.1578655353183981929, user:scidbadmin, ns:public): insert(redimension(apply(temp_26190to27931, date, 20190715135704, utm_zone, 10, latitude, x, longitude, y, elevation, b1), KOMPSAT_10_10, false), KOMPSAT_10_10) ; from program:IP /opt/scidb/18.1/bin/iquery --host IP -n -q SET LANG AFL; insert(redimension(apply(temp_26190to27931, date, 20190715135704, utm_zone, 10, latitude, x, longitude, y, elevation, b1), KOMPSAT_10_10, false), KOMPSAT_10_10) ;
2020-01-10 20:22:37.000188 [0x7fed8cff5700] [WARN ]: RedimensionCommon::redimensionArray: Data collision is detected at cell position {20190715135704, 10, 0, 0} for attribute ID = 1. Add log4j.logger.scidb.array.RedimensionCommon=TRACE to the log4cxx config file for more
2020-01-10 20:22:40.000708 [0x7fed8b6dc700] [INFO ]: QueryTiming 0 0.1578655353183981929 TOT 7.524482 ACT 7.518721 CPU 2.468422 wPG 0.000000 wFSr 0.000000 wFSw 0.000000 wFSws 0.000000 wFSf 0.000000 wSMld 0.000000 wBFrd 0.000000 wSMcm 0.000000 wREP 0.000000 wNETr 0.000000 wNETs 0.000000 wNETrr 0.000000 wNETrc 0.000000 wSGr 0.000000 wSGb 0.000000 wBAR 0.000000 wJsrt0.000000 wEXT 0.000000 wSEMo 0.000000 wEVo 0.000000 wLTCH 0.000000 wRare 0.000000 wZero 0.000000 OTH% 67.2 insert(redimension(apply(temp_26190to27931, date, 20190715135704, utm_zone, 10, latitude, x, longitude, y, elevation, b1), KOMPSAT_10_10, false), KOMPSAT_10_10)
2020-01-10 20:22:40.000734 [0x7fed8e308700] [INFO ]: Executing query(id:0.1578655360721938002, user:scidbadmin, ns:public): remove(temp_26190to27931); from program: 127.0.0.1:43922 /opt/scidb/18.1/bin/shim shim -s 1239 -t /tmp -m 2048 -o 36000 -i 0 ;
2020-01-10 20:22:40.000812 [0x7fed8e308700] [INFO ]: QueryTiming 0 0.1578655360721938002 TOT 0.089977 ACT 0.087006 CPU 0.004721 wPG 0.000000 wFSr 0.000000 wFSw 0.000000 wFSws 0.000000 wFSf 0.000000 wSMld 0.000000 wBFrd 0.000000 wSMcm 0.000000 wREP 0.000000 wNETr 0.000000 wNETs 0.000000 wNETrr 0.000000 wNETrc 0.000000 wSGr 0.000000 wSGb 0.000000 wBAR 0.000000 wJsrt0.000000 wEXT 0.000000 wSEMo 0.000000 wEVo 0.000000 wLTCH 0.000000 wRare 0.000000 wZero 0.000000 OTH% 94.6 remove(temp_26190to27931)
2020-01-10 20:22:40.000812 [0x7fed9e896fc0] [ERROR]: Dropping mtCompleteQuery for queryID=0.1578655360721938002, from CLIENT because SystemException in file: src/query/Query.cpp function: getQueryByID line: 1078
Error id: scidb::SCIDB_SE_QPROC::SCIDB_LE_QUERY_NOT_FOUND
Error description: Query processor error. Query 0.1578655360721938002 not found.
{code}
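
To see which statements the "not found" query IDs belong to, the log lines above can be paired up mechanically. The following is a minimal Python sketch and not part of the original workflow: the function name dropped_queries is hypothetical, and the regular expressions assume the log line layout shown in the excerpt.

```python
import re

# Patterns assume the scidb.log line layout shown in the excerpt above.
EXEC_RE = re.compile(r"Executing query\(id:([\d.]+),[^)]*\): (.*)")
DROP_RE = re.compile(r"Dropping mtCompleteQuery for queryID=([\d.]+)")


def dropped_queries(lines):
    """Return (query_id, query_text, program) for every dropped mtCompleteQuery."""
    started = {}  # query ID -> (query text, client program) seen so far
    dropped = []
    for line in lines:
        match = EXEC_RE.search(line)
        if match:
            # Everything before "; from program:" is the query, the rest is the client.
            query, _, program = match.group(2).partition("; from program:")
            started[match.group(1)] = (query.strip(), program.strip())
            continue
        match = DROP_RE.search(line)
        if match:
            query, program = started.get(match.group(1), ("<unknown>", "<unknown>"))
            dropped.append((match.group(1), query, program))
    return dropped


# Two lines copied from the log excerpt above.
sample = [
    "2020-01-10 20:22:21.000702 [0x7fed8b6dc700] [INFO ]: Executing query(id:0.1578655341687374636, user:scidbadmin, ns:public): remove(temp_22698to24443); from program: 127.0.0.1:43922 /opt/scidb/18.1/bin/shim shim -s 1239 -t /tmp -m 2048 -o 36000 -i 0 ;",
    "2020-01-10 20:22:21.000770 [0x7fed9e896fc0] [ERROR]: Dropping mtCompleteQuery for queryID=0.1578655341687374636, from CLIENT because SystemException in file: src/query/Query.cpp function: getQueryByID line: 1078",
]
for query_id, query, program in dropped_queries(sample):
    print(query_id, query, program, sep=" | ")
```

In the excerpt above, every dropped query ID belongs to a remove() issued through shim, not to one of the insert(redimension(...)) statements issued through iquery.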

Hi,
In the log snippet you provided, I see “Dropping mtCompleteQuery for queryID” and then an identifier for a query (e.g., 0.1578655352985201930). This could happen when another client issues a cancel() query for that ID in another session. This would stop the query matching the ID and cause it not to be found.
Are you running iquery directly? Or are you using SciDBR or SciDB-Py?
Can you please send along your config.ini?
Thanks,
Dave

Hello. Thank you for your answer.

Regarding your questions:
The redimension is done with iquery, invoked through os.system from Python. SciDB-Py is not used in that step.
After the redimension completes, the original arrays are dropped using SciDB-Py in the same Python code.
From the log it seems that the array cannot be dropped, but we still don’t understand why.

Here is the config.ini

/opt/scidb/18.1/etc/config.ini

[mydb]
server-0=127.0.0.1,16
install_root=/opt/scidb/18.1
pluginsdir=/opt/scidb/18.1/lib/scidb/plugins
logconf=/opt/scidb/18.1/share/scidb/log4cxx.properties
db_user=scidb_pg_user
base-port=1239
base-path=/vm/scidb_data
redundancy=0
security=trust

target-cells-per-chunk=100000
mem-array-threshold=1024
smgr-cache-size=256
merge-sort-buffer=512
chunk-size-limit-mb=128

execution-threads=48
result-prefetch-threads=30
result-prefetch-queue-size=1
operator-threads=4

admin-queries=1
client-queries=47

sg-send-queue-size=16
sg-receive-queue-size=16

max-arena-page-size=8

And here are shim service settings:

/var/lib/shim/conf

# Shim configuration file
# Uncomment and change any of the following values. Restart shim for
# your changes to take effect (default values are shown). See
# man shim
# for more information on the options.

#ports=8080,8083s
scidbhost=IP_address
scidbport=1239
instance=0
tmp=/tmp
user=
max_sessions=2048
timeout=36000
#max_sessions=50
#timeout=60
#aio=1

Hi,
Thanks for the quick reply. To better understand what’s going on, can you send along the script containing your workflow? It sounds like there are python and non-python pieces that are interacting and I need a clearer picture of what’s happening.
Thanks!
Dave

Hello. Here is how the data is being loaded and where the error happens:

# 1. Redimension data and save it to array from temp array
qry_str = 'iquery --host IP_ADDRESS -n -q "SET LANG AFL; insert(redimension(apply(temp_array_name, date, 20190101, latitude, x, longitude, y, elevation, b1), array_name, false), array_name)"'
os.system(qry_str)

# 2. Remove temp array
scidb.remove(temp_array_name)

Originally there are multiple temp arrays, and the data from all of them is saved to the final multi-dimensional array. So operations #1 and #2 are performed in a Python loop, once for each temp array.
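
One way to make a loop like this easier to debug is to capture iquery's exit status and error output before the temp array is dropped. The following is only a sketch under assumptions: the helper names build_insert_query and load_one are hypothetical, the query text follows the statement that appears in the server log, remove_array stands for whatever call drops the array (for example the scidb.remove call in step 2), and it assumes iquery returns a non-zero exit status when a query fails.

```python
import subprocess


def build_insert_query(temp_array, target_array, date, utm_zone):
    """Build the AFL statement for one temp array (same shape as in the server log)."""
    return (
        f"insert(redimension(apply({temp_array}, date, {date}, "
        f"utm_zone, {utm_zone}, latitude, x, longitude, y, elevation, b1), "
        f"{target_array}, false), {target_array})"
    )


def load_one(temp_array, target_array, date, utm_zone, host,
             remove_array, run=subprocess.run):
    """Run the insert via iquery and drop the temp array only if iquery succeeded.

    remove_array is a caller-supplied callable (hypothetical parameter);
    run is injectable so the function can be exercised without a SciDB server.
    """
    query = build_insert_query(temp_array, target_array, date, utm_zone)
    # Same flags as the original command line; a list avoids shell quoting.
    cmd = ["iquery", "--host", host, "-n", "-q", f"SET LANG AFL; {query}"]
    result = run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Assumption: a failed query makes iquery exit with a non-zero status.
        raise RuntimeError(f"iquery failed for {temp_array}: {result.stderr.strip()}")
    remove_array(temp_array)
    return result
```

Passing the command as a list means the semicolon after SET LANG AFL is never interpreted by a shell, and the raised error carries iquery's own message for the array that failed.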

Thanks for the follow-up. It looks like the redimension is failing on a data collision. In the log, there’s the error “Data collision is detected at cell position {20190715135704, 10, 0, 0} for attribute ID = 1”. If you run this on the shell, outside your python program, you’ll see the error as a response on the terminal.

iquery -aq "insert(redimension(apply(temp_22698to24443, date, 20190715135704, utm_zone, 10, latitude, x, longitude, y, elevation, b1), KOMPSAT_10_10, false), KOMPSAT_10_10)"

If you send along the schemas for your arrays, I can help you create a redimension query that won’t collide your data.

Thanks,
Dave

Hello. Thank you for the fast reply.
Here are the array schemas you were asking about:

Temp array schema:
<x:int32, y:int32, b:uint16>[i=0:*:0:*]

Result array schema:
<band:int32>[date=0:*:0:1; latitude=min_int32:*:0:1000; longitude=min_int32:*:0:1000]

We previously tried to do the redimensioning through SciDB-Py, but it did not work: the shim service failed during the process. That is why we call iquery from the Python program. If you have any recommendations about that problem as well, please let me know.

Thank you.

In the apply operator, I see elevation added as a new attribute, based on b1. But I don’t see b1 in either of these array schemas that you’ve given (I do see b). Is there a typo somewhere? Either in the original redimension query or in the schemas you’ve given me?

Separately, redimension produces a result array using some or all variables of a source array, potentially changing some or all of those variables from dimensions to attributes or vice versa. So it works best when these variables are called out by name. If elevation from the source array is meant to become band in the destination array, then change elevation to band in apply.
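
The name matching described above can be checked before a query is sent. The following Python sketch is only an illustration: check_redimension_names is a hypothetical helper, it models names as plain strings, and it treats string expressions in apply as attribute references and numbers as literals.

```python
def check_redimension_names(source_names, apply_pairs, destination_names):
    """Compare names by hand, the way the schemas are compared in this thread.

    source_names:      attribute and dimension names of the source array
    apply_pairs:       (new_name, expression) pairs passed to apply();
                       a str expression is read as an attribute reference
    destination_names: attribute and dimension names of the destination array
    Returns (unknown_inputs, unmatched_destination).
    """
    source = set(source_names)
    # Expressions that reference a name the source array does not have.
    unknown_inputs = [expr for _, expr in apply_pairs
                      if isinstance(expr, str) and expr not in source]
    # Names available to redimension: source names plus the applied ones.
    available = source | {new_name for new_name, _ in apply_pairs}
    unmatched_destination = sorted(set(destination_names) - available)
    return unknown_inputs, unmatched_destination


apply_pairs = [("date", 20190715135704), ("utm_zone", 10),
               ("latitude", "x"), ("longitude", "y"), ("elevation", "b1")]

# Schemas as first posted: attribute b in the source, band in the destination.
print(check_redimension_names(
    ["i", "x", "y", "b"], apply_pairs,
    ["band", "date", "latitude", "longitude"]))
```

With the schemas as first posted, the check reports b1 as an unknown input and band as a destination name that nothing in the query produces, which is the mismatch pointed out above.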

Hello.
Sorry, the array schemas did not match the redimension query because we use several different array schemas. Here are the correct ones:

Temp array schema:
<x:int32, y:int32, b1:uint16>[i=0:*:0:*]

Result array schema:
<elevation:int32>[date=0:*:0:1; latitude=min_int32:*:0:1000; longitude=min_int32:*:0:1000; utm_zone=-100:100:0:1]

So, as you said, we use the apply operator to rename b1 in the temp array to elevation in the result array.

But I have one more question. In the temp array the (x, y) pairs are supposed to be unique, so how can a collision happen? Also, I pass false as the last argument of the redimension operator, which should ignore a collision if one happens.

And one more thing: after the collisions happen, there is a problem dropping the temp array, as you can see in the following log. Is it somehow connected to the collision problem, or is there something else we should think about?

**2020-01-10 16:12:18.000212 [0x7fed8c1e7700] [INFO ]: Executing query(id:0.1578640338202299364, user:scidbadmin, ns:public): remove(temp_26085to27815); from program: 127.0.0.1:43922 /opt/scidb/18.1/bin/shim shim -s 1239 -t /tmp -m 2048 -o 36000 -i 0 ;**
2020-01-10 16:12:18.000278 [0x7fed8c1e7700] [INFO ]: QueryTiming 0 0.1578640338202299364 TOT 0.076179 ACT 0.073692 CPU 0.006394 wPG 0.000000 wFSr 0.000000 wFSw 0.000000 wFSws 0.000000 wFSf 0.000000 wSMld 0.000000 wBFrd 0.000000 wSMcm 0.000000 wREP 0.000000 wNETr 0.000000 wNETs 0.000000 wNETrr 0.000000 wNETrc 0.000000 wSGr 0.000000 wSGb 0.000000 wBAR 0.000000 wJsrt0.000000 wEXT 0.000000 wSEMo 0.000000 wEVo 0.000000 wLTCH 0.000000 wRare 0.000000 wZero 0.000000 OTH% 91.3 remove(temp_26085to27815)
**2020-01-10 16:12:18.000279 [0x7fed9e896fc0] [ERROR]: Dropping mtCompleteQuery for queryID=0.1578640338202299364, from CLIENT because SystemException in file: src/query/Query.cpp function: getQueryByID line: 1078**
**Error id: scidb::SCIDB_SE_QPROC::SCIDB_LE_QUERY_NOT_FOUND**
**Error description: Query processor error. Query 0.1578640338202299364 not found.**

Thank you.

You can avoid the cell collision problem by providing a synthetic dimension in the destination schema for redimension. Without the synthetic dimension, one of the colliding cells is chosen, but there is no way to tell which one. By adding a synthetic dimension (a dimension whose name differs from every source array attribute and dimension), you instruct redimension to resolve cell collisions by storing all colliding cells along that dimension. You’ll have to update the destination array’s schema to include a dimension whose name does not appear in the input, such as synth or z.
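
The effect of a synthetic dimension can be modeled in a few lines of plain Python. This is a conceptual sketch, not SciDB code: find_collisions and place_along_synthetic are hypothetical helpers, cells are (latitude, longitude, value) tuples, and the order in which colliding cells receive their synthetic index here simply follows input order, which is an assumption of the model.

```python
from collections import Counter, defaultdict


def find_collisions(cells):
    """Return {(latitude, longitude): count} for coordinates that occur more than once."""
    counts = Counter((lat, lon) for lat, lon, _ in cells)
    return {coord: n for coord, n in counts.items() if n > 1}


def place_along_synthetic(cells):
    """Model a redimension with a synthetic dimension.

    Each cell lands at (latitude, longitude, synth), where synth counts how many
    cells with the same (latitude, longitude) were already placed, so no value
    is dropped when coordinates collide.
    """
    next_index = defaultdict(int)
    placed = {}
    for lat, lon, value in cells:
        synth = next_index[(lat, lon)]
        placed[(lat, lon, synth)] = value
        next_index[(lat, lon)] += 1
    return placed


# Made-up example data: two cells share the coordinates (0, 0).
cells = [(0, 0, 5), (0, 0, 7), (1, 2, 9)]
print(find_collisions(cells))
print(place_along_synthetic(cells))
```

Running find_collisions on the (x, y) pairs exported from a temp array is also a quick way to confirm whether the pairs really are unique before redimensioning.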

Without seeing your python script, I can’t say why your array removes are not working. Are the arrays still appearing in list() after the failed remove()?