
# Change Block size of existing files in Hadoop

<https://stackoverflow.com/questions/29604823/change-block-size-of-existing-files-in-hadoop?rq=1>

Consider a Hadoop cluster where the default block size is 64 MB in `hdfs-site.xml`. Later on, the team decides to change this to 128 MB. Here are my questions about this scenario:

1. Will this change require a restart of the cluster, or will it be picked up automatically so that all new files get the default block size of 128 MB?
2. What will happen to the existing files, which have a block size of 64 MB? Will the configuration change apply to them automatically? If so, when does it happen: as soon as the change is made, or when the cluster is started? If not, how can the block size be changed manually?

> Will this change require a restart of the cluster, or will it be picked up automatically so that all new files get the default block size of 128 MB?

A restart of the cluster will be required for this property change to take effect.

> What will happen to the existing files which have block size of 64M? Will the change in the configuration apply to existing files automatically?

Existing blocks will not change their block size.
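
To make the difference concrete, the block count of a file under each setting can be worked out with simple arithmetic. The following is a minimal sketch: the helper `block_count` and the sample path in the comment are hypothetical, and the `hadoop fs -stat` line assumes a release whose `-stat` supports the `%o` (block size) format.

```shell
# Number of HDFS blocks needed for a file: ceil(file_bytes / block_bytes).
block_count() {
  file_bytes=$1
  block_bytes=$2
  echo $(( (file_bytes + block_bytes - 1) / block_bytes ))
}

MB=$((1024 * 1024))
ONE_GB=$((1024 * MB))

# A 1 GB file written with 64 MB blocks versus 128 MB blocks.
BLOCKS_AT_64MB=$(block_count "$ONE_GB" $((64 * MB)))
BLOCKS_AT_128MB=$(block_count "$ONE_GB" $((128 * MB)))
echo "1 GB file: ${BLOCKS_AT_64MB} blocks at 64 MB, ${BLOCKS_AT_128MB} blocks at 128 MB"

# On a cluster, the block size recorded for an existing file can be printed with
# (hypothetical path; %o = block size in bytes, %n = file name):
#   hadoop fs -stat "%o %n" /path/to/old/files/part-00000
```

A file written before the configuration change keeps reporting the old block size until its data is rewritten.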

> If not automatically done, then how to manually do this block change?

To change existing files, you can use `distcp`. It copies the files over with the new block size. However, you will have to manually delete the old files that have the older block size. Here's a command that you can use:

```
hadoop distcp -Ddfs.block.size=XX /path/to/old/files /path/to/new/files/with/larger/block/sizes
```
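
The copy, check, and delete sequence might be scripted roughly as follows. This is only a sketch under assumptions: `OLD_DIR`, `NEW_DIR`, the `run` helper, and the `DRY_RUN` switch are hypothetical names introduced for illustration, 128 MB is written out in bytes, and the verification step assumes `hadoop fs -stat` supports the `%o` format. On newer releases the property is also known as `dfs.blocksize`.

```shell
# Hypothetical HDFS paths; replace with real directories.
OLD_DIR="/path/to/old/files"
NEW_DIR="/path/to/new/files"

# New block size in bytes (128 MB).
BLOCK_SIZE_BYTES=$((128 * 1024 * 1024))

# Dry run by default: commands are printed, not executed. Set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# 1. Copy the data; the copies are written with the new block size.
run hadoop distcp "-Ddfs.block.size=${BLOCK_SIZE_BYTES}" "$OLD_DIR" "$NEW_DIR" || exit 1

# 2. Print the block size (%o) and name (%n) of each copied file to confirm the change.
run hadoop fs -stat "%o %n" "${NEW_DIR}/*"

# 3. Only after checking the copies, remove the old files manually, for example:
#   hadoop fs -rm -r "$OLD_DIR"    (-rmr on older releases)
```

Keeping the delete as a separate, manual step leaves the original data in place until the copies have been checked.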

As mentioned [here](https://stackoverflow.com/a/28590163/3831557), for your points:

1. Whenever you change a configuration, you need to restart the NameNode and DataNodes in order for them to change their behavior.
2. No, it will not. It will keep the old block size on the old files. For the data to pick up the new block size, you need to rewrite it. You can run either `hadoop fs -cp` or `distcp` on your data. The new copy will have the new block size, and you can then delete your old data.
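
For a single file, the `hadoop fs -cp` route could look something like the sketch below. The function name `rewrite_file`, the `.rewrite_tmp` suffix, and the example path are hypothetical. The sketch assumes the generic `-D` option is accepted before the `-cp` subcommand, and the sequence is not atomic, so it should not run while other jobs read or write the file.

```shell
# Rewrite one HDFS file so that it is stored with a new block size.
# Usage: rewrite_file <hdfs_path> <block_size_bytes>
rewrite_file() {
  src=$1
  block_bytes=$2
  tmp="${src}.rewrite_tmp"   # hypothetical temporary name next to the original

  # Copy: the new file is written with the requested block size.
  hadoop fs -D "dfs.block.size=${block_bytes}" -cp "$src" "$tmp" || return 1

  # Delete the original only after the copy succeeded, then move the copy into place.
  # A move is a rename in HDFS, so the copy keeps its new block size.
  hadoop fs -rm "$src" || return 1
  hadoop fs -mv "$tmp" "$src"
}

# Example call (hypothetical path, 128 MB in bytes):
#   rewrite_file /path/to/old/files/data.txt $((128 * 1024 * 1024))
```

For whole directory trees or large volumes of data, `distcp` is usually the better fit because it copies in parallel as a MapReduce job.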

Check the linked answer for more information.

