Polyalphabetic Substitution: the Vigenère ("Virginia") Cipher and an Implementation of a Key-Recovery Algorithm
Fang Xianjin (方贤进)
http://star.aust.edu.cn/~xjfang
Email: [email protected]
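Vigenère encryption and decryption algorithms
The Vigenère encryption algorithm
• Assume the character set of the language is
Charset[26]={'a', 'b', …, 'z'}
with character-set size 26.
• The corresponding character codes are
Coding[26]={0, 1, …, 25}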
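• The Vigenère cipher encrypts a plaintext by cycling through several monoalphabetic substitution (shift) ciphers, as directed by the key.
• Let the plaintext string be
M=m1m2…mn, mi∈Charset, where n is the plaintext length
• the key be
K=k1k2…kd, ki∈Charset, where d is the key length
• and the ciphertext be
C=c1c2…cn, ci∈Charset, where n is the ciphertext length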
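• Encryption:
$c_{j+td} = (m_{j+td} + k_j) \bmod 26$
j=1…d, t=0…ceiling(n/d)-1, restricted to positions j+td ≤ n,
where ceiling(x) denotes the smallest integer not less than x.
• Decryption:
$m_{j+td} = (c_{j+td} - k_j) \bmod 26$
j=1…d, t=0…ceiling(n/d)-1, restricted to positions j+td ≤ n.
Each letter stands for its code in Coding, and mod returns a value in 0…25.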
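The encryption and decryption formulas can be sketched in C as follows. This is a minimal sketch under stated assumptions, not the slides' own implementation: the function names `vigenere_encrypt` and `vigenere_decrypt` are hypothetical, indices are 0-based (position i uses key character i mod d), and the input is assumed to be lower-case letters only with a non-empty lower-case key.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: encrypts lower-case text m with lower-case key k.
   Assumes k is non-empty and out has room for strlen(m) + 1 bytes.
   0-based form of the formula: c[i] = (m[i] + k[i mod d]) mod 26. */
void vigenere_encrypt(const char *m, const char *k, char *out)
{
    size_t n = strlen(m), d = strlen(k);
    for (size_t i = 0; i < n; i++)
        out[i] = (char)('a' + ((m[i] - 'a') + (k[i % d] - 'a')) % 26);
    out[n] = '\0';
}

/* Hypothetical helper: inverse of vigenere_encrypt.
   Adds 26 before taking the remainder so the result is never negative. */
void vigenere_decrypt(const char *c, const char *k, char *out)
{
    size_t n = strlen(c), d = strlen(k);
    for (size_t i = 0; i < n; i++)
        out[i] = (char)('a' + ((c[i] - 'a') - (k[i % d] - 'a') + 26) % 26);
    out[n] = '\0';
}
```

Called with the plaintext "nothingisto" and the key "joy", `vigenere_encrypt` produces "wcrqwlpwqcc", the ciphertext of the slides' worked example.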
A worked example of Vigenère encryption

Position i      1   2   3   4   5   6   7   8   9   10  11
Plaintext M     n   o   t   h   i   n   g   i   s   t   o
(code)          13  14  19  7   8   13  6   8   18  19  14
Key K           j   o   y   j   o   y   j   o   y   j   o
(code)          9   14  24  9   14  24  9   14  24  9   14
Ciphertext C    w   c   r   q   w   l   p   w   q   c   c
(code)          22  2   17  16  22  11  15  22  16  2   2
j               1   2   3   1   2   3   1   2   3   1   2
t               0   0   0   1   1   1   2   2   2   3   3

The plaintext length is n=11 and the key length is d=3, so t runs from 0 to ceiling(11/3)-1=3.
An original plaintext
Differential Privacy is the state-of-the-art goal for the problem of privacy-preserving data release
and privacy-preserving data mining. Existing techniques using differential privacy, however,
cannot effectively handle the publication of high-dimensional data. In particular, when the input
dataset contains a large number of attributes, existing methods incur higher computing
complexity and lower information to noise ratio, which renders the published data next to useless.
This proposal aims to reduce computing complexity and signal to noise ratio. The starting point is
to approximate the full distribution of high-dimensional dataset with a set of low-dimensional
marginal distributions via optimizing score function and reducing sensitivity, in which generation
of noisy conditional distributions with differential privacy is computed in a set of low-dimensional
subspaces, and then, the sample tuples from the noisy approximation distribution are used to
generate and release the synthetic dataset. Some crucial science problems would be investigated
below: (i) constructing a low k-degree Bayesian network over the high-dimensional dataset via
exponential mechanism in differential privacy, where the score function is optimized to reduce
the sensitivity using mutual information, equivalence classes in maximum joint distribution and
dynamic programming; (ii)studying the algorithm to compute a set of noisy conditional
distributions from joint distributions in the subspace of Bayesian network, via the Laplace
mechanism of differential privacy. (iii)exploring how to generate synthetic data from the
differentially private Bayesian network and conditional distributions, without explicitly
materializing the noisy global distribution. The proposed solution may have theoretical and
technical significance for synthetic data generation with differential privacy on business prospects.
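The plaintext after preprocessing
(only characters in the character set are kept; upper-case letters are converted to lower case)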
differentialprivacyisthestateoftheartgoalfortheproblemofprivacypreservingdatarelease
andprivacypreservingdataminingexistingtechniquesusingdifferentialprivacyhoweverca
nnoteffectivelyhandlethepublicationofhighdimensionaldatainparticularwhentheinputd
atasetcontainsalargenumberofattributesexistingmethodsincurhighercomputingcomple
xityandlowerinformationtonoiseratiowhichrendersthepublisheddatanexttouselessthisp
roposalaimstoreducecomputingcomplexityandsignaltonoiseratiothestartingpointistoap
proximatethefulldistributionofhighdimensionaldatasetwithasetoflowdimensionalmargi
naldistributionsviaoptimizingscorefunctionandreducingsensitivityinwhichgenerationof
noisyconditionaldistributionswithdifferentialprivacyiscomputedinasetoflowdimensiona
lsubspacesandthenthesampletuplesfromthenoisyapproximationdistributionareusedto
generateandreleasethesyntheticdatasetsomecrucialscienceproblemswouldbeinvestiga
tedbelowiconstructingalowkdegreebayesiannetworkoverthehighdimensionaldatasetvi
aexponentialmechanismindifferentialprivacywherethescorefunctionisoptimizedtoredu
cethesensitivityusingmutualinformationequivalenceclassesinmaximumjointdistributio
nanddynamicprogrammingiistudyingthealgorithmtocomputeasetofnoisyconditionaldis
tributionsfromjointdistributionsinthesubspaceofbayesiannetworkviathelaplacemechan
ismofdifferentialprivacyiiiexploringhowtogeneratesyntheticdatafromthedifferentiallypr
ivatebayesiannetworkandconditionaldistributionswithoutexplicitlymaterializingthenois
yglobaldistributiontheproposedsolutionmayhavetheoreticalandtechnicalsignificancefo
rsyntheticdatagenerationwithdifferentialprivacyonbusinessprospects
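The preprocessing that yields the text above can be sketched as follows. The slides do not show their preprocessing code, so the function name `preprocess` and its interface are assumptions; the sketch keeps ASCII letters, lower-cases them, and drops everything else.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: copies only the ASCII letters of src into dst,
   converted to lower case, and returns how many letters were kept.
   dst must have room for strlen(src) + 1 bytes. */
size_t preprocess(const char *src, char *dst)
{
    size_t n = 0;
    for (; *src != '\0'; src++) {
        unsigned char ch = (unsigned char)*src;
        if (ch < 128 && isalpha(ch))      /* skip spaces, digits, punctuation */
            dst[n++] = (char)tolower(ch);
    }
    dst[n] = '\0';
    return n;
}
```

Applied to "Differential Privacy is the state-of-the-art", it yields "differentialprivacyisthestateoftheart", which is how the preprocessed text begins.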
The ciphertext after Vigenère encryption
Encryption key: key=infosec
lvktwvgvgnodttqifqqmubujglevmbkhziczglcsphweyvwttwoqseshxenjsgaxejgwvxqalrsxczrqsswgiaid
jmxipddjiumeawfkfigfaarkvtjlawvqalhwgjvvviwwwavsuvmhnrwsfxkiyufazcklmcoixmehofrqbrktwg
vqijzqlcvqqsllgxhgzagcbvtbgjjqtmraqgvfncfenlnyoarrieywuyniebvwrvprnbhyvlnyokivkbshsmpanqo
jkgvhrpwvqnnyhjmdcgjgwbkagnbyqgbutrkmpkhwvakjmehcetwbvsuusoxyjlaxaiaizgagzvstgvoigncf
xqvbngwvcbvtkzmepejbvitagmshydtvxvwhfigfbwbvbbzgwpgafyvawrzbuckenivrglstmqzqwgquczha
rikbrddizqgdofhuqtsodxqvbngwvcbvthziubnwharixbnblmubbfdhvqfvrolivprkidpfqfyfafwbvtbgjjqt
mraqgvfncfenlnyokivevyvswgbbkzgafqzjbkmqvnqasviqafzvmubenpmxkwaxjaeqxgnaadkvtxqgvgnh
sqlmqvnsrjifcpnbywgvfnhazkblnbolkkulsfitigncfshvbngqgqvqnhaspiyiwkxtqozhaspajnhzhknsjfwrv
qnqdjmxipdwkgquczhwhkvnxslshtbbraqgvfncfenahggheemffbvxjmayvwwcucqslyrtrxtjsobujbgmu
gnudjszqzfhasplvxhjmdcgncfetmhxsvxqorssjevmnsrjinmnxsllgalshzivqpioleumgxceiezhhwspukvjbu
irzbgzwquebzzvfgqaaskxkonysvfgtbbwuspagwiuxkvtfzgamlrlfwidiljgaepvrykgvmwijfllgpvlvvmomax
wgrctqfhswgbinowbrwajblmctzjqzepqfrwfhknsjfwrvqnqdjmxipdkzitmgmskgqzrkifgvqbswksrbvrwr
ifbbwsvyemgmskipavywnmvghxwfkocgzodmpnbwasxkwajemmxiyjbuietnxgwwkvzflaqwuwtwfxfqf
yfafwbvtbsrfllsoemexetujeouvsuamubhimaribujodkqzvyvexqkbrdmxgifjhgjpwvxmusplvywgrctqng
lvkjhywgrunetabskvgiwkxtqozhaspavshziucoxdsggwsgoqiuqnsbwxywepjaevprqohpckrrsulcvvxagjf
qsksjipbvfzhvkdnhmamkmkuzgvkvtmcoxqorssjevmfdbllgbvhrsxcnetallglvktwvgvgnodpaxenjsxgjnd
skmcvajhostsnsrusplvywgrctqnglvkjhywgruevyvgyvmkuzagkbydasxgzvfzadkvtyvwrqqfdudsdiyiwkx
tqozhaspbujdjsrwfjrksncgncfqcgufjwxjmbwslmeiyfbvxgkuswuenavlbajkknsqwjqzfdbllgbvhrsxcorss
jevqbskaxjlvktwvgvgnodttqifqqspjhxwfiuacwcktgkgxvzlj
Recovering the Vigenère key
(ciphertext-only attack)
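Concept: the index of coincidence and its unbiased estimate
• Index of coincidence: suppose a language consists of n letters and letter i occurs with probability p_i (1≤i≤n). The index of coincidence, written IC, is the probability that two randomly chosen letters are the same:
$$IC = \sum_{i=1}^{n} p_i^2$$
• In practice IC is approximated by its unbiased estimate IC', where x_i is the number of occurrences of letter i, L is the text length, and n is the number of letters in the language:
$$IC' = \sum_{i=1}^{n} \frac{x_i (x_i - 1)}{L (L - 1)}$$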
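A minimal C sketch of the unbiased estimate IC' follows. The function name `ic_estimate` is hypothetical, and the sketch assumes the 26-letter lower-case alphabet used in these slides; any other character is simply skipped.

```c
#include <stddef.h>

/* Hypothetical helper: unbiased estimate
   IC' = sum_i x_i (x_i - 1) / (L (L - 1)) over the letters 'a'..'z'.
   Characters outside 'a'..'z' are ignored.
   Returns 0.0 when the text has fewer than 2 letters. */
double ic_estimate(const char *text)
{
    unsigned long x[26] = {0};   /* x[i]: occurrences of letter i */
    unsigned long L = 0;         /* number of letters counted */

    for (; *text != '\0'; text++) {
        if (*text >= 'a' && *text <= 'z') {
            x[*text - 'a']++;
            L++;
        }
    }
    if (L < 2)
        return 0.0;

    double sum = 0.0;
    for (int i = 0; i < 26; i++)
        sum += (double)x[i] * ((double)x[i] - 1.0);
    return sum / ((double)L * (double)(L - 1));
}
```

As a quick check, the string "aabb" has x_a = x_b = 2 and L = 4, so IC' = (2 + 2) / 12 = 1/3.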
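Three properties of IC'
1. For random English text (letters drawn uniformly at random), IC' is approximately 0.038 (about 1/26).
2. For meaningful English text, IC' is approximately 0.065.
3. Shift-encrypting a plaintext does not change IC': a shift only relabels the letters, so the set of counts x_i stays the same.
These three conclusions are very important!
They can be verified with the experiments below.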
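The third property can be checked with a small C sketch. The names `shift_encrypt` and `ic_of` are hypothetical, the shift key is assumed to lie in 0…25, and the text is assumed to contain lower-case letters only.

```c
#include <stddef.h>

/* Hypothetical helper: shift encryption of lower-case text,
   out[i] = (in[i] + key) mod 26, with key assumed in 0..25. */
void shift_encrypt(const char *in, int key, char *out)
{
    size_t i;
    for (i = 0; in[i] != '\0'; i++)
        out[i] = (char)('a' + (in[i] - 'a' + key) % 26);
    out[i] = '\0';
}

/* Hypothetical helper: IC' of a string of lower-case letters.
   The numerator is accumulated as an integer, so two texts with the
   same multiset of letter counts give bit-identical results. */
double ic_of(const char *s)
{
    unsigned long x[26] = {0}, L = 0, sum = 0;
    for (; *s != '\0'; s++) {
        x[*s - 'a']++;
        L++;
    }
    if (L < 2)
        return 0.0;
    for (int i = 0; i < 26; i++)
        if (x[i] > 1)
            sum += x[i] * (x[i] - 1);
    return (double)sum / ((double)L * (double)(L - 1));
}
```

Shifting a text with any key and calling `ic_of` on both versions returns the same value, because the shift permutes the counts without changing them.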
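Example 1:
A random English plaintext and its IC'
Its IC' is 0.0388.
The ciphertext obtained by shift-encrypting the random plaintext above (key=17) and its IC'
The ciphertext's IC' is also 0.0388.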
Example 2:
A meaningful English text
Differential Privacy is the state-of-the-art goal for the problem of privacy-preserving data release
and privacy-preserving data mining. Existing techniques using differential privacy, however, cannot
effectively handle the publication of high-dimensional data. In particular, when the input dataset
contains a large number of attributes, existing methods incur higher computing complexity and
lower information to noise ratio, which renders the published data next to useless. This proposal
aims to reduce computing complexity and signal to noise ratio. The starting point is to approximate
the full distribution of high-dimensional dataset with a set of low-dimensional marginal
distributions via optimizing score function and reducing sensitivity, in which generation of noisy
conditional distributions with differential privacy is computed in a set of low-dimensional
subspaces, and then, the sample tuples from the noisy approximation distribution are used to
generate and release the synthetic dataset. Some crucial science problems would be investigated
below: (i) constructing a low k-degree Bayesian network over the high-dimensional dataset via
exponential mechanism in differential privacy, where the score function is optimized to reduce the
sensitivity using mutual information, equivalence classes in maximum joint distribution and
dynamic programming; (ii)studying the algorithm to compute a set of noisy conditional
distributions from joint distributions in the subspace of Bayesian network, via the Laplace
mechanism of differential privacy. (iii)exploring how to generate synthetic data from the
differentially private Bayesian network and conditional distributions, without explicitly materializing
the noisy global distribution. The proposed solution may have theoretical and technical significance
for synthetic data generation with differential privacy on business prospects.
The unbiased estimate IC' of its index of coincidence is 0.0659.
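Suppose Vigenère encryption is applied to meaningful English text. How can a ciphertext produced by Vigenère polyalphabetic substitution be broken?
(ciphertext-only attack)
Step 1: estimate the key length of the Vigenère polyalphabetic cipher
Step 2: then compute each character of the key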
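Step 1: estimate the key length
(1) Split the ciphertext into 2 substrings and compute the average of their IC' values;
(2) split the ciphertext into 3 substrings and compute the average of their IC' values;
……
(k-1) split the ciphertext into k substrings and compute the average of their IC' values.
When the ciphertext is split into d substrings, substring i (i=1…d) consists of the ciphertext characters at positions i, i+d, i+2d, ….
If the average IC' is approximately 0.065 when the ciphertext is split into d substrings, then the Vigenère key length is d.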
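The key-length test above can be sketched in C as follows. The names `ic_column`, `average_ic`, and `estimate_key_length` are hypothetical, and the `threshold` parameter is an assumption standing in for "approximately 0.065" (a value such as 0.06 might be tried). The sketch returns the smallest qualifying d, since multiples of the true key length would also pass the test. The text is assumed to contain lower-case letters only.

```c
#include <stddef.h>
#include <string.h>

/* IC' of the characters text[start], text[start + step], ...
   Assumes text holds only 'a'..'z'. */
static double ic_column(const char *text, size_t n, size_t start, size_t step)
{
    unsigned long x[26] = {0}, L = 0, sum = 0;
    for (size_t i = start; i < n; i += step) {
        x[text[i] - 'a']++;
        L++;
    }
    if (L < 2)
        return 0.0;
    for (int i = 0; i < 26; i++)
        if (x[i] > 1)
            sum += x[i] * (x[i] - 1);
    return (double)sum / ((double)L * (double)(L - 1));
}

/* Average IC' over the d interleaved substrings of text (d >= 1). */
double average_ic(const char *text, size_t d)
{
    size_t n = strlen(text);
    double total = 0.0;
    for (size_t j = 0; j < d; j++)
        total += ic_column(text, n, j, d);
    return total / (double)d;
}

/* Returns the smallest d in 1..max_d whose average IC' reaches
   threshold, or 0 if no such d exists. */
size_t estimate_key_length(const char *text, size_t max_d, double threshold)
{
    for (size_t d = 1; d <= max_d; d++)
        if (average_ic(text, d) >= threshold)
            return d;
    return 0;
}
```

Trying d = 1, 2, 3, … in increasing order mirrors the slides' procedure of testing 2 substrings, then 3, and so on.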
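Example: splitting the ciphertext into 2 substrings
The average of the unbiased IC estimates of the 2 substrings is IC'=0.0419
Example: splitting the ciphertext into 3 substrings
The average of the unbiased IC estimates of the 3 substrings is IC'=0.0419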
Example: continuing in the same way, splitting the ciphertext into 7 substrings
Substring 1:
lvqbmzwwxxqziimivqvanikmbqvxbqvliiplkavncabkmbxizivbpatibazimukqqvbbxbfpqbqvlebqv
qbwxvnvcvbkivviqanqiuvtvammutbgqlcmommaqmzkzeqotavlivwpmtbwtqnqimzqbbmagcn
witvuqblxubbzkiwltjnvqacwqwpkvqbdmvombnlvxjvsltjembzvqiqbwcgmikakzboqlvqjak
Substring 2:
vgiubgeoeearapegtavvryleriqhvtfneernbnhngguhevyavgbvegvgbfbvqcbgtbvnbbvrfvtfnvbzna
eagthnpflugbqyojsnpcnbfhfacrunzvghrnnlpghvbbanbgtrlrivaqiazfsnpgrbvbgvhgbaynzwfvlev
huvbfvvqhegovosnerrvsvnktrfvevgenanvqhvkyvtfyoufgubyuvnfvrbvgihcg
Substring 3:
knfjklyqnjlqidafjlvswumhkjqgtmnyybnysqryjntwhsjisnntjmxfzyurzzrdsntwnfrkytmnyykjqfnxn
xssnnnlnnniznjqdzxbngfyqxjufxnxssxsixhjgzaybwfljyjlxfnjjrjqdmksrwmyxzwjjxftytstsijyrjxynyt
izsxgspqrxkfhumsdhtknndjsynyyudfydizjjnfwfslsdhssknfxwx
……
Substring 7:
gtuvchthaxcgxufkvjwhkcxqvcgcjgnrnvvvpgqdkgpjwoagoqcetdfvgrntqizuqcuiuqvfwjgnvgfqiuk
qkgqfgkkthqptpkvxqkhgnejcrouzpdtqvngvueurugkgpkmdpmgocgrcpkvxtqvrfepvopkxekwfwf
eouiqqgppckuktpuguyvccfpkkkqvgcggagctpckuvkgkqdtprncjegnkqgcvjgtpug
The average of the unbiased IC estimates of the 7 substrings is IC'=0.0657
Splitting the ciphertext into several substrings and averaging their unbiased IC estimates
(d is the number of substrings; each cell gives substring length/IC')

d   Substring 1 Substring 2 Substring 3 Substring 4 Substring 5 Substring 6 Substring 7 Average IC'
1   1609/0.0419                                                                         0.0419
2   805/0.0427  804/0.0411                                                              0.0419
3   537/0.0417  536/0.0417  536/0.0424                                                  0.0419
4   403/0.0425  402/0.0398  402/0.0424  402/0.0427                                      0.0419
5   322/0.0417  322/0.0414  322/0.0418  322/0.0413  321/0.0411                          0.0415
6   269/0.0402  268/0.0397  268/0.0441  268/0.0432  268/0.0419  268/0.0416              0.0418
7   230/0.0674  230/0.0677  230/0.0621  230/0.0584  230/0.0744  230/0.0666  229/0.0634  0.0657
8   …           …                                                           …           0.0422
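Meaningful English plaintext has IC ≈ 0.065, and shift encryption does not change the IC, so each correctly separated ciphertext substring also has IC ≈ 0.065.
The table above therefore shows that the key length is d=7.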
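Step 2: compute each character of the key
(1) By the Vigenère encryption algorithm, all ciphertext letters in one substring were produced from plaintext letters by the same shift: the i-th substring (i=1…d) is shift-encrypted with the i-th character of the key. The key space of a shift cipher is only 26. So decrypt each ciphertext substring with each of the 26 shifts and compute the quasi-index of coincidence of the result each time. The shift (code) with the highest quasi-index of coincidence is the letter of the Vigenère key for that substring.
(2) Repeating step (1) for all d substrings yields every letter of the key.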
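The quasi-index of coincidence
• Quasi-index of coincidence: suppose a language consists of n letters and letter i has statistical probability p_i (i=1…n). Let f_{i,j} be the number of occurrences of letter i in ciphertext substring C_j (j=1…d) after a trial shift decryption, and let n_j be the length of C_j. The quasi-index of coincidence of the j-th substring is defined as:
$$M_j = \sum_{i=1}^{n} p_i \cdot \frac{f_{i,j}}{n_j}, \quad j = 1 \dots d$$
For the correct shift, f_{i,j}/n_j ≈ p_i, so M_j ≈ Σ p_i^2 ≈ 0.065; for a wrong shift it is noticeably smaller.
Statistical probabilities (p_i) of the letters in plaintext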
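The quasi-index of coincidence and the per-substring search over 26 shifts can be sketched in C as follows. Several things here are assumptions rather than the slides' own material: the function names `quasi_ic` and `recover_key` are hypothetical, the `ENGLISH_FREQ` table holds commonly published approximate English letter frequencies (the slides' own p_i table is not reproduced in this transcript), indices are 0-based, and the ciphertext is assumed to contain lower-case letters only.

```c
#include <stddef.h>
#include <string.h>

/* Assumed p_i values for 'a'..'z': commonly published approximate
   English letter frequencies, not the table from the slides. */
static const double ENGLISH_FREQ[26] = {
    0.08167, 0.01492, 0.02782, 0.04253, 0.12702, 0.02228, 0.02015,
    0.06094, 0.06966, 0.00153, 0.00772, 0.04025, 0.02406, 0.06749,
    0.07507, 0.01929, 0.00095, 0.05987, 0.06327, 0.09056, 0.02758,
    0.00978, 0.02360, 0.00150, 0.01974, 0.00074
};

/* Quasi-IC of substring j (0-based) of ciphertext c, split with period d,
   after undoing a trial shift s in 0..25:
   M = sum_i p_i * f_i / n_j, with f_i counted in the trial decryption. */
double quasi_ic(const char *c, size_t d, size_t j, int s)
{
    size_t n = strlen(c), len = 0;
    unsigned long f[26] = {0};
    for (size_t i = j; i < n; i += d) {
        f[(c[i] - 'a' - s + 26) % 26]++;
        len++;
    }
    if (len == 0)
        return 0.0;
    double m = 0.0;
    for (int i = 0; i < 26; i++)
        m += ENGLISH_FREQ[i] * (double)f[i] / (double)len;
    return m;
}

/* For a known key length d, keeps for each substring the shift with the
   highest quasi-IC. key must have room for d + 1 bytes. */
void recover_key(const char *c, size_t d, char *key)
{
    for (size_t j = 0; j < d; j++) {
        int best = 0;
        double best_m = -1.0;
        for (int s = 0; s < 26; s++) {
            double m = quasi_ic(c, d, j, s);
            if (m > best_m) {
                best_m = m;
                best = s;
            }
        }
        key[j] = (char)('a' + best);
    }
    key[d] = '\0';
}
```

The sketch numbers the shifts 0…25, so shift 0 corresponds to the key letter "a" (listed as shift 26 in the slides). On short substrings the highest quasi-IC may not pick the right letter, so the result is best checked by decrypting and reading the output.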
Example: decrypting ciphertext substring 3 with each of the 26 shifts
Substring 3:
knfjklyqnjlqidafjlvswumhkjqgtmnyybnysqryjntwhsjisnntjmxfzyurzzrdsntwnf
rkytmnyykjqfnxnxssnnnlnnniznjqdzxbngfyqxjufxnxssxsixhjgzaybwfljyjlxfnjjrj
qdmksrwmyxzwjjxftytstsijyrjxynytizsxgspqrxkfhumsdhtknndjsynyyudfydizjjn
fwfslsdhssknfxwx
The quasi-index of coincidence of substring 3 is computed after decryption with each of the 26 shifts:

Shift   Quasi-IC    Shift   Quasi-IC
1 (b)   0.0387      14      0.0326
2 (c)   0.0325      15      0.0348
3 (d)   0.0324      16      0.0416
4 (e)   0.0368      17      0.0392
5 (f)   0.0615      18      0.0405
6       0.0433      19      0.0361
7       0.0332      20      0.0461
8       0.0279      21      0.0386
9       0.0468      22      0.0356
10      0.0384      23      0.0313
11      0.0365      24      0.0364
12      0.0356      25      0.0429
13      0.0368      26 (a)  0.0340

The highest quasi-IC, 0.0615, occurs at shift 5, so the third letter of the Vigenère key is "f".
Proceeding in the same way for all 7 ciphertext substrings gives the Vigenère key "infosec".
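Programming assignment requirements
• The programming language is C.
• Implement Vigenère encryption and decryption for any meaningful English text file (*.txt), with the key given as an arbitrary input string.
• Without knowing the key, break a ciphertext file produced by the Vigenère encryption algorithm: recover the key and generate the corresponding plaintext.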
The End
Thank you!