
I found an anomaly in the block size data: block 295182. This one seems to have a way higher payload. What happened there?

[Image: Bitcoin Block Size]

Or is there something wrong with my parsing? This block does appear to be at the end of a .dat file.

Here's the code I used:

import struct
import matplotlib.pyplot as plt

blockchain_path = 'mypath/blocks'

def get_block_path(index):
    file = 'blk' + str(index).zfill(5) + '.dat'
    return blockchain_path + '/' + file

def plot(x, y, title, xlabel, ylabel):
    plt.scatter(x, y)
    plt.suptitle(title, fontsize=20)
    plt.xlabel(xlabel)
    plt.ylabel(ylabel)
    plt.show()

def uint4(stream):
    # read a 4-byte unsigned int; '<' makes the little-endian byte order explicit
    return struct.unpack('<I', stream.read(4))[0]

def blocks_remaining(f, i):
    # peek at the next 4 bytes, then rewind; a short read means end of file
    cur = f.tell()
    try:
        uint4(f)
        f.seek(cur)
        return True
    except struct.error:
        print("End of file: blk" + str(i).zfill(5) + '.dat')
        return False

BlockHeight, BlockSize = [], []
cur_block_height = -1


for i in range(0, 100):
    block_path = get_block_path(i)
    # process one file at a time
    with open(block_path, 'rb', buffering=16 * 1024 * 1024) as f:
        # loop over the blocks in this file
        while blocks_remaining(f, i):
            cur_block_height += 1
            magic_num = uint4(f)
            block_size = uint4(f)

            print(cur_block_height, f.tell(), magic_num, block_size)

            # append data
            BlockHeight.append(cur_block_height)
            BlockSize.append(block_size)

            # skip to next block without loading the payload into memory
            f.seek(block_size, 1)

plot(BlockHeight, BlockSize, 'Block Size', 'block height', 'bytes')
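
A more defensive version of the read loop could check each record before trusting its size field. The sketch below is only an illustration: the names `iter_block_records`, `MAINNET_MAGIC` and `MAX_BLOCK_SIZE` are made up for this example, the magic value is the mainnet one (0xD9B4BEF9), and the size cap is an assumed sanity limit, not something read from the files. It stops at the first record that fails a check instead of plotting it.

```python
import struct

MAINNET_MAGIC = 0xD9B4BEF9   # mainnet magic, read as a little-endian uint32
MAX_BLOCK_SIZE = 4_000_000   # assumed sanity cap; 1_000_000 is enough for early blocks


def iter_block_records(stream, max_size=MAX_BLOCK_SIZE):
    """Yield (offset, size) for each well-formed record in an open blk file.

    Iteration stops at the first record whose magic or size looks wrong,
    or whose payload is cut short.
    """
    while True:
        offset = stream.tell()
        prefix = stream.read(8)
        if len(prefix) < 8:
            return  # end of file
        magic, size = struct.unpack('<II', prefix)
        if magic != MAINNET_MAGIC or size > max_size:
            return  # not a block record (padding or misaligned read)
        payload = stream.read(size)
        if len(payload) < size:
            return  # truncated record
        yield offset, size
```

With this, the inner loop would become something like `for offset, size in iter_block_records(f): ...`, and an implausible size such as the 2.4 GB outlier would end the file's iteration rather than end up in the plot.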
  • Could there be some suffix in the block files after the payload of the last block of a .dat file? – Bitcoingraffiti Apr 22 '21 at 13:04
  • Block 295182 is 149,287 bytes, as you can see here: https://blockchair.com/bitcoin/block/000000000000000043759025a93fbf64c13f581fdac9e678b5bbcf5b62d42790?_type=block&_search=header – leevancleef Apr 22 '21 at 13:05
  • @leevancleef I'm not trusting, just verifying. I have other values. If you claim theirs is right, then where is my mistake? – Bitcoingraffiti Apr 22 '21 at 13:13
  • You can check it on your node with "bitcoin-cli getblock 000000000000000043759025a93fbf64c13f581fdac9e678b5bbcf5b62d42790". If I knew where your mistake was, I would have answered :) – leevancleef Apr 22 '21 at 13:25
  • If I'm reading your graph correctly, your code found a block that is around 2.4 Gigabytes in size. That's too large for a valid Bitcoin block. You might want to validate magic_num == 0xD9B4BEF9 and block_size <= 1M in your code – RedGrittyBrick Apr 22 '21 at 13:33
  • The magic numbers line up, so the parsing seems to work fine. I guess there might be an issue with the end-of-file situation. However, for blocks within blk00000.dat I also find block sizes that differ from bitcoin-cli and blockchain.com. Very strange, as I parse out the magic number perfectly each time and the block size comes right after it... – Bitcoingraffiti Apr 22 '21 at 13:48
  • Can you post a sample of the actual block sizes you are finding? Particularly for the outlier block that you have found too. – Andrew Chow Apr 22 '21 at 16:46
  • @AndrewChow You're right! The magic number is screwed up there! Still, my block sizes differ from the ones bitcoin-cli reports. Investigating it now. – Bitcoingraffiti Apr 22 '21 at 17:20
  • Apparently the data is off from block 353 onward. I just checked the Merkle roots, and starting from that block the data is not correct. – Bitcoingraffiti Apr 22 '21 at 21:34
  • From this point, my block height + 30 is the right block. Could it be that the data in my blk files is not sorted by height? Right now I assign heights in the linear order the blocks come in. – Bitcoingraffiti Apr 23 '21 at 09:13
  • Ok. I assumed blocks were ordered by height in the data. This caused the problem. – Bitcoingraffiti Apr 24 '21 at 20:39
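
Since the conclusion above is that blocks in the blk files are not stored in height order, heights have to be derived from the chain links rather than from file position. A minimal sketch of that idea, assuming you have already collected the 80-byte header of every block (the first 80 bytes of each payload): each header stores the previous block's hash in bytes 4 to 36, a block's own hash is the double SHA-256 of its header, and the genesis block points at an all-zero hash. The function names here are invented for the example.

```python
import hashlib


def block_hash(header):
    """Double SHA-256 of the 80-byte header, in internal byte order."""
    return hashlib.sha256(hashlib.sha256(header).digest()).digest()


def prev_hash(header):
    """Bytes 4..36 of the header hold the previous block's hash."""
    return header[4:36]


def assign_heights(headers):
    """Map block hash -> height by following prev-hash links from genesis.

    Headers may be given in any order. Blocks whose ancestry cannot be
    traced back to genesis are left out of the result.
    """
    children = {}
    for header in headers:
        children.setdefault(prev_hash(header), []).append(block_hash(header))

    heights = {}
    # Genesis is the block whose previous-hash field is all zeros.
    frontier = [(child, 0) for child in children.get(b'\x00' * 32, [])]
    while frontier:
        current, height = frontier.pop()
        heights[current] = height
        frontier.extend((c, height + 1) for c in children.get(current, []))
    return heights
```

Note that this assigns a height to every block it can connect, including stale blocks on short forks, so two hashes may share a height. Picking the main chain would need an extra step (for example, keeping only ancestors of the highest tip), which is not shown here. Block explorers display hashes byte-reversed, so reverse `block_hash(...)` before comparing with bitcoin-cli output.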
