Improving Data Confidentiality by Using Access Structures Over Data Deduplication in Cloud
¹M DEVIKA, ²V. SESHA BHARGAVI
¹M. Tech Student, Department of CSE, G. Narayanamma Institute of Technology and Science (GNITS), Shaikpet, Hyderabad, India
²Assistant Professor, Department of CSE, G. Narayanamma Institute of Technology and Science (GNITS), Shaikpet, Hyderabad, India
ABSTRACT: Cloud-based data sharing is growing rapidly and attracting more users, which makes it necessary to keep the shared data safe and confidential. In attribute-based encryption (ABE), each item of uploaded data is associated with an access policy defined over a set of attributes, and only users whose attributes satisfy the policy can decrypt the data. However, standard ABE systems do not support secure deduplication, a technique that eliminates duplicate copies of identical data and thereby reduces storage space, network bandwidth, and processing time. In this paper, we present an attribute-based storage system that supports secure deduplication on a hybrid cloud, in which a public cloud stores the data and a private cloud detects duplicate data. The proposed system has the following advantages. First, it provides confidentiality by specifying access policies for users rather than designating decryption keys. Second, it provides stronger security for data confidentiality than existing systems, which offer only weak security. We use an approach that converts a ciphertext under one access policy into ciphertexts of the same plaintext under other access policies without revealing the plaintext.
Keywords: Cloud Storage, CP-ABE, Attribute-Based Encryption, Deduplication.
1. INTRODUCTION
As cloud usage has grown because of the advantages it offers, an abundance of data is now stored in the cloud, and cloud service providers are responsible for storing that data efficiently and keeping it secure. Data deduplication is used for efficient storage: it is a technique adopted to reduce storage space by minimizing large amounts of redundant data. The basic idea is to store only a single copy of each piece of data and eliminate the repeated copies. ABE is an encryption technique that supports data confidentiality by specifying access policies for users rather than defining decryption keys. In ABE, when a data provider uploads data, he specifies an access policy over a set of attributes that determines who can access the data. When a user wants to download the data, he presents his credentials (attributes); if his attributes match the access policy, he can access the data.
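The attribute-matching rule described above can be illustrated with a minimal Java sketch. It assumes an access policy is a formula of AND/OR gates over attribute names; the `AccessPolicy` interface and the attribute names ("CSE", "Professor", "Student") are hypothetical and do not come from the paper. In a real ABE scheme the policy is enforced cryptographically during decryption, not by an explicit check like this one.

```java
import java.util.Arrays;
import java.util.Set;

// Hypothetical model: an access policy is a boolean formula over attribute names.
// Real CP-ABE enforces the policy cryptographically; this only shows the matching rule.
interface AccessPolicy {
    boolean isSatisfiedBy(Set<String> attributes);

    // Leaf: satisfied when the user holds the named attribute.
    static AccessPolicy attr(String name) {
        return attributes -> attributes.contains(name);
    }

    // AND gate: every sub-policy must be satisfied.
    static AccessPolicy and(AccessPolicy... parts) {
        return attributes -> Arrays.stream(parts).allMatch(p -> p.isSatisfiedBy(attributes));
    }

    // OR gate: at least one sub-policy must be satisfied.
    static AccessPolicy or(AccessPolicy... parts) {
        return attributes -> Arrays.stream(parts).anyMatch(p -> p.isSatisfiedBy(attributes));
    }
}

class AccessPolicyDemo {
    public static void main(String[] args) {
        // Illustrative policy: CSE AND (Professor OR Student)
        AccessPolicy policy = AccessPolicy.and(
                AccessPolicy.attr("CSE"),
                AccessPolicy.or(AccessPolicy.attr("Professor"), AccessPolicy.attr("Student")));

        System.out.println(policy.isSatisfiedBy(Set.of("CSE", "Student"))); // true
        System.out.println(policy.isSatisfiedBy(Set.of("ECE", "Student"))); // false
    }
}
```

A user holding the attributes CSE and Student satisfies the illustrative policy, while a user holding ECE and Student does not.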
Fig 1. Deduplication
Figure 1 illustrates data deduplication. Without deduplication, the same data elements are stored multiple times, which wastes storage space. With deduplication, only a single copy of identical data is stored; when a duplicate is identified, the file is not stored again and a pointer to the existing copy is recorded instead. This minimizes redundant data and maximizes space savings in cloud storage.
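The single-copy-plus-pointer idea can be sketched in Java as follows. This is a minimal in-memory illustration, assuming that a SHA-256 digest of the file content serves as the duplicate-detection tag; the `DedupStore` class and its method names are hypothetical and are not the paper's implementation.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory store: one copy per distinct content, plus file name -> tag pointers.
class DedupStore {
    private final Map<String, byte[]> blobsByTag = new HashMap<>();    // tag -> the single stored copy
    private final Map<String, String> tagByFileName = new HashMap<>(); // file name -> pointer (tag)

    // Returns true if the content was new and stored, false if only a pointer was recorded.
    boolean upload(String fileName, byte[] content) {
        String tag = sha256Hex(content);
        boolean isNew = !blobsByTag.containsKey(tag);
        if (isNew) {
            blobsByTag.put(tag, content.clone());
        }
        tagByFileName.put(fileName, tag);
        return isNew;
    }

    // Follows the pointer for the file name; returns null for an unknown name.
    byte[] download(String fileName) {
        String tag = tagByFileName.get(fileName);
        return tag == null ? null : blobsByTag.get(tag).clone();
    }

    // Total bytes physically stored, counting each distinct content once.
    long storedBytes() {
        long total = 0;
        for (byte[] blob : blobsByTag.values()) {
            total += blob.length;
        }
        return total;
    }

    static String sha256Hex(byte[] data) {
        try {
            StringBuilder hex = new StringBuilder();
            for (byte b : MessageDigest.getInstance("SHA-256").digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }

    public static void main(String[] args) {
        DedupStore store = new DedupStore();
        byte[] report = "quarterly report".getBytes(StandardCharsets.UTF_8);
        System.out.println(store.upload("report.txt", report));      // true: first copy is stored
        System.out.println(store.upload("report-copy.txt", report)); // false: only a pointer is added
    }
}
```

Uploading the same content under a second file name adds a pointer but no new bytes, which is the behaviour shown in the deduplicated half of Figure 1.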
2. RELATED WORK
Sahai and Waters [6] proposed the notion of attribute-based encryption (ABE), and Goyal et al. [16] then formulated two forms of ABE: key-policy ABE (KP-ABE) and ciphertext-policy ABE (CP-ABE). These existing ABE systems were not designed to support secure deduplication. To save cloud storage space, Douceur et al. [23] first proposed convergent encryption, a solution that offers both confidentiality and efficiency in deduplication. In convergent encryption, a message is encrypted with a key derived from the message itself, so identical plaintexts yield identical ciphertexts. However, this also lets the cloud server detect which ciphertexts are equal. To strengthen security, Abadi et al. [9] considered plaintext distributions that depend on the public parameters of the scheme.
Suppose a data provider ‘X’ wants to upload a file ‘F’ to the cloud. He must encrypt ‘F’ under an access policy ‘A’ defined over a set of attributes and upload the resulting ciphertext, so that only users who satisfy the access policy can decipher the data. Later, another data provider ‘Y’ wants to upload the same file ‘F’ under a different access policy ‘B’. Because the same file is encrypted under different access policies, the cloud cannot deduplicate it; it stores the same file twice and wastes storage space.
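The key property of convergent encryption mentioned above, that identical plaintexts produce identical ciphertexts, can be shown with a short Java sketch. The concrete choices here are assumptions made for illustration and are not taken from Douceur et al. or from this paper: the key is the first 16 bytes of SHA-256 of the message, and encryption uses AES in CTR mode with a fixed all-zero IV so that the result is deterministic.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Illustrative convergent encryption: the key is derived from the message itself.
class ConvergentEncryption {
    // Assumed derivation: first 16 bytes of SHA-256(message), used as an AES-128 key.
    static byte[] deriveKey(byte[] message) throws GeneralSecurityException {
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(message);
        return Arrays.copyOf(digest, 16);
    }

    // Deterministic encryption: the fixed zero IV is tolerable here only because
    // each key is tied to exactly one message.
    static byte[] encrypt(byte[] message) throws GeneralSecurityException {
        return run(Cipher.ENCRYPT_MODE, deriveKey(message), message);
    }

    // Decryption requires the message-derived key, which the uploader keeps.
    static byte[] decrypt(byte[] key, byte[] ciphertext) throws GeneralSecurityException {
        return run(Cipher.DECRYPT_MODE, key, ciphertext);
    }

    private static byte[] run(int mode, byte[] key, byte[] input) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(new byte[16]));
        return cipher.doFinal(input);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        byte[] fromX = "file F".getBytes(StandardCharsets.UTF_8);
        byte[] fromY = "file F".getBytes(StandardCharsets.UTF_8);
        // Two providers encrypting the same plaintext obtain the same ciphertext.
        System.out.println(Arrays.equals(encrypt(fromX), encrypt(fromY))); // true
    }
}
```

Because the ciphertexts match, a server can deduplicate them without seeing the plaintext, but for the same reason it can also tell that two uploads are equal.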
3. PROPOSED SYSTEM
The proposed system provides security through a hybrid cloud architecture and provides confidentiality through access policies instead of decryption keys. It uses ciphertext-policy ABE (CP-ABE), in which encrypted data is stored together with an access policy and only users who satisfy that policy can decrypt the data, and it performs deduplication securely. Figure 2 shows the system architecture, which consists of four entities: users, the cloud, data providers, and the attribute authority (AA). A data provider outsources data to the cloud and shares it with users by specifying access policies over certain credentials. The cloud consists of a public cloud and a private cloud: the private cloud performs tag checking and the public cloud stores the data. The attribute authority issues each user a decryption key associated with that user's set of attributes.
The data provider sends a file storage request to the cloud along with the access structure. On receiving the request, the private cloud tests the tag of the file for equality against the tags already stored in the system. If no match is found, it adds the tag and stores the file and the tag in the public cloud. If a match is found, only the single existing copy of the file is kept under the existing tag. Even if the same plaintext is uploaded by different data providers under different access policies, the public cloud stores a single ciphertext under the union of the two access policies. A user whose attributes satisfy the access structure can download the ciphertext file and decrypt it with the private key generated by the attribute authority.
Fig 2. Proposed System Architecture
This system helps cloud service providers manage storage efficiently while providing security and data confidentiality.
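The upload flow described in this section (tag check in the private cloud, single stored copy in the public cloud, union of access policies on a duplicate) can be sketched in Java. This is only a bookkeeping model under stated assumptions: the class `HybridCloudSketch`, the string tags, and the representation of a policy as a set of required attributes are all hypothetical. In the actual scheme the stored ciphertext is transformed to the union policy cryptographically and access is enforced by CP-ABE decryption, neither of which is modeled here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical bookkeeping model of the hybrid cloud; no real cryptography is performed.
class HybridCloudSketch {
    // One stored ciphertext plus every access policy accumulated for it.
    static final class Entry {
        final byte[] ciphertext;
        // Union of policies: satisfying ANY one of them grants access.
        final List<Set<String>> policies = new ArrayList<>();

        Entry(byte[] ciphertext) {
            this.ciphertext = ciphertext.clone();
        }
    }

    private final Set<String> privateCloudTags = new HashSet<>();          // tag index for equality testing
    private final Map<String, Entry> publicCloudStorage = new HashMap<>(); // tag -> stored entry

    // Returns true if a new ciphertext was stored, false if the upload was deduplicated.
    boolean upload(String tag, byte[] ciphertext, Set<String> policy) {
        if (privateCloudTags.add(tag)) {
            // No matching tag: register it and store the file in the public cloud.
            Entry entry = new Entry(ciphertext);
            entry.policies.add(policy);
            publicCloudStorage.put(tag, entry);
            return true;
        }
        // Matching tag: keep the single existing copy and extend its policy to the union.
        publicCloudStorage.get(tag).policies.add(policy);
        return false;
    }

    // Stand-in for successful CP-ABE decryption: the user's attributes cover some policy.
    boolean canAccess(String tag, Set<String> userAttributes) {
        Entry entry = publicCloudStorage.get(tag);
        if (entry == null) {
            return false;
        }
        for (Set<String> policy : entry.policies) {
            if (userAttributes.containsAll(policy)) {
                return true;
            }
        }
        return false;
    }

    int storedCopies() {
        return publicCloudStorage.size();
    }
}
```

For example, if provider X uploads a file with tag `tag-F` under one policy and provider Y later uploads the same tag under another, `storedCopies()` remains 1 and users satisfying either policy pass `canAccess`.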
4. RESULTS AND DISCUSSION
The system is implemented in Java and deployed on the Amazon cloud. Amazon EC2 (Elastic Compute Cloud) is a web service designed to make web-scale cloud computing easier for developers; it provides secure, resizable compute capacity in the cloud. Table 1 lists the file sizes, and Figure 3 compares storage before and after deduplication using ABE. Attribute-based storage saves space, and it can deduplicate files with identical content even when they have different file names and different access policies.
FILE NAME          FILE SIZE (KB)
File 1 with A      100
File 1 with A’     100
File 3             50
File 4             40
File 5             65
File 6             80
Table 1 – File names with File size (in KB)
Fig 3. Storage efficiency of existing vs proposed system
Figure 3 compares the storage efficiency of the existing system with that of the proposed system. In the existing system, File 1 is stored twice because the same file is uploaded under different access policies. In the proposed system, the file is stored only once even when it is uploaded twice under different access policies, so the storage space is reduced.
Total storage space in existing system = (File 1 with A + File 1 with A’ + File 3 + File 4 + File 5 + File 6)
= 100 + 100 + 50 + 40 + 65 + 80 = 435 KB
Total storage space in proposed system = (File 1 with A + File 1 with A’ + File 3 + File 4 + File 5 + File 6), where File 1 with A’ adds no storage because it duplicates File 1 with A
= 100 + 0 + 50 + 40 + 65 + 80 = 335 KB
Total savings = storage space in existing system – storage space in proposed system = 435 – 335 = 100 KB
Total storage space saved is 100 KB, about 23% of the 435 KB used by the existing system.
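The storage totals above can be reproduced with a few lines of Java. The sizes are taken from Table 1; the content identifiers ("file1", "file3", and so on) are hypothetical labels introduced only to mark which rows hold identical data.

```java
import java.util.HashSet;
import java.util.Set;

// Recomputes the Table 1 totals with and without file-level deduplication.
class StorageSavings {
    // Each row: { label, content id (hypothetical), size in KB }.
    // Rows sharing a content id hold identical data.
    static final String[][] TABLE_1 = {
        {"File 1 with A",  "file1", "100"},
        {"File 1 with A'", "file1", "100"},
        {"File 3",         "file3", "50"},
        {"File 4",         "file4", "40"},
        {"File 5",         "file5", "65"},
        {"File 6",         "file6", "80"},
    };

    // Sums the sizes; when deduplicate is true, repeated content is counted only once.
    static int totalKb(boolean deduplicate) {
        Set<String> seenContent = new HashSet<>();
        int total = 0;
        for (String[] row : TABLE_1) {
            boolean duplicate = !seenContent.add(row[1]);
            if (deduplicate && duplicate) {
                continue; // a duplicate costs no extra storage in the proposed system
            }
            total += Integer.parseInt(row[2]);
        }
        return total;
    }

    public static void main(String[] args) {
        int existing = totalKb(false);
        int proposed = totalKb(true);
        // prints "435 KB, 335 KB, saved 100 KB"
        System.out.println(existing + " KB, " + proposed + " KB, saved " + (existing - proposed) + " KB");
    }
}
```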
Fig 4. File processing time of existing vs proposed system
Figure 4 compares the file processing time of the existing and proposed systems.
Fig 5. Attribute size vs Encryption time
Figure 5 shows that the time taken for encryption increases as the attribute size increases. Overall, the proposed system manages storage space efficiently by performing deduplication, and it provides data confidentiality and security for users of cloud services.
5. CONCLUSION AND FUTURE SCOPE
This paper aims to eliminate identical copies of repeated data in order to save storage space; deduplication also saves network bandwidth. In cloud computing, attribute-based encryption (ABE) lets data providers share data with users who satisfy the specified credentials, but standard ABE systems cannot support deduplication securely. The proposed system is built on a hybrid cloud architecture in which the private cloud performs tag checking and the public cloud provides storage. It also provides data confidentiality by specifying access policies rather than decryption keys. The system can deduplicate files with the same content, including files with different file names and files uploaded under different access policies, which existing systems cannot do. The deduplication is implemented on Amazon cloud storage. In future work, deduplication can be implemented at the block level, which uses storage more effectively than file-level deduplication.